LIGAND DETECTION SYSTEM AND METHODS OF USE THEREOF

CROSS-REFERENCE TO RELATED APPLICATIONS 

The present application is a continuation-in-part of U.S. provisional
application serial number 60/031,793, filed November 26, 1996, and U.S. provisional
application serial number 60/043,560, filed April 15, 1997, both of which provisional
applications are fully incorporated herein by reference.
BACKGROUND OF THE INVENTION 
1. Field of the Invention 

The present invention relates to a novel ligand detection system and methods
of using the system to identify ligands capable of specifically binding orphan protein
domains. The invention particularly relates to peptide ligands capable of specifically
binding an orphan domain such as the PDZ domain of neuronal nitric oxide synthase
(nNOS). Further provided are methods of detecting the peptide ligands and orphan
protein domains capable of specifically binding the peptide ligands. The present
invention is useful for a variety of applications including detecting peptide ligands
with therapeutic capacity to treat human diseases.

Thirteen billion distinct peptides were screened to determine that the nNOS-PDZ
domain binds with nanomolar affinity to peptides ending in Asp-X-Val. Preference
for Asp at the -2 peptide position is mediated by Tyr-77 of nNOS, and mutating this
residue to His changes the binding specificity from Asp-X-Val to Thr-X-Val. Guided
by the Asp-X-Val consensus, candidate nNOS interacting proteins have been
identified, including glutamate and melatonin receptors. The peptides comprising the
consensus sequence Asp-X-Val are useful in altering the interaction of the nNOS PDZ
domain with its cognate interacting proteins to prevent the overproduction of NO.
Altering the interaction between these proteins with the peptides of the invention can
be used to treat many neurodegenerative diseases, including stroke, ALS, Alzheimer's
disease, Parkinson's disease and Huntington's disease. The peptides will also be
useful for the treatment of muscular dystrophies such as Duchenne muscular
dystrophy and motility disorders such as irritable bowel syndrome.

The present invention also relates to a method of identifying the amino acid
sequence of a peptide or protein that interacts with a protein domain of interest
(orphan protein domain). The disclosed Protein Interaction Network (PIN) uses an in
vitro selection strategy that identifies the amino acid sequences which interact with a
given orphan protein domain. This sequence information is then used to search
nucleic acid and protein sequence libraries. Interacting PINs from different orphan
protein domains are assembled into an electronic resource that can be searched with
the sequence of a protein domain of interest.
2. Background 

All publications and patent applications herein are incorporated by reference
to the same extent as if each individual publication or patent application was
specifically and individually indicated to be incorporated by reference.

A fundamental area of inquiry in biology is the analysis of interactions 
between proteins. Proteins are complex macromolecules made up of covalently 
linked chains of amino acids. Each protein assumes a unique three dimensional shape 
determined principally by its sequence of amino acids. Many proteins consist of 
smaller units termed domains, which are continuous stretches of amino acids able to
fold independently from the rest of the protein. Some of the important functions of 
proteins are as enzymes, polypeptide hormones, nutrient transporters, structural 
components of the cell, hemoglobins, antibodies, nucleoproteins, and components of 
viruses. 

Protein-protein interactions enable two or more proteins to associate. A large
number of non-covalent bonds form between the proteins when two protein surfaces
are precisely matched, and these bonds account for the specificity of recognition.
Protein-protein interactions are involved, for example, in the assembly of enzyme
subunits; in antigen-antibody reactions; in forming the supramolecular structures of
ribosomes, filaments, and viruses; in transport; and in the interaction of receptors on a
cell with growth factors and hormones. Products of oncogenes can give rise to
neoplastic transformation through protein-protein interactions. For example, some
oncogenes encode protein kinases whose enzymatic activity on cellular target proteins
leads to the cancerous state. Another example of a protein-protein interaction occurs
when a virus infects a cell by recognizing a polypeptide receptor on the surface, and
this interaction has been used to design antiviral agents.

Evidence has accumulated over the past years that protein-protein interactions
are often mediated by protein modules or domains such as src homology domain 2
(SH2) and src homology domain 3 (SH3). Recently a novel modular domain has been
identified in a diverse set of proteins that are typically associated with cell junctions,
including synapses of the central nervous system. These novel modular domains are
known as PDZ domains. PDZ domains have also been called "GLGF repeats" and
"discs-large homology repeats" (DHRs) and consist of about 80 amino acids. These
domains were first identified as repeated sequences in the neuron-specific
postsynaptic density protein (PSD-95/SAP-90), the Drosophila septate junction
protein discs-large (dlg), and the epithelial tight-junction protein zona occludens-1
(ZO-1) (K. Cho et al., Neuron, 9:929-942 (1992); S. Gomperts, Cell, 84:659-662
(1996)). PDZ domains occur in structural proteins of the cytoskeleton and in a
heterogeneous family of enzymes that associate with the cytoskeleton, suggesting a
role for PDZ domains in protein-protein interactions (C. Ponting et al., Trends in
Biological Sciences, 20:102-103 (1995)). Supporting this notion, the three PDZ
domains within PSD-95 were first shown to bind the carboxy-terminal Ser/Thr-X-Val
motif found in certain N-methyl-D-aspartate (NMDA) type glutamate receptors and in
Shaker type potassium channel subunits (E. Kim et al., Nature, 378:85-88 (1995); H.
Kornau et al., Science, 269:1737-1740 (1995)). Clustering and localizing channels at
synaptic sites is one function of the concatenated domains (M. Sheng, Neuron,
17:575-578 (1996)).

The crystal structures of the third PDZ domains of PSD-95 and dlg have been
determined (D. Doyle et al., Cell, 85:1067-1076 (1996); J. Cabral et al., Nature,
382:649-652 (1996)). The PDZ structures show a "carboxylate binding loop",
containing the signature GLGF sequence, which interacts with the C-terminal
carboxylate group of the peptide ligand. The peptide ligand forms main chain
interactions with backbone amide groups in a conserved helix and β strand of the PDZ
domain. A critical sequence-specific interaction has been noted between the
threonine at the -2 position of the bound peptide and a histidine residue in the PDZ
domain (D. Doyle et al., Cell, 85:1067-1076 (1996)). This histidine is conserved in
all PDZ repeats of dlg, PSD-95 and related proteins. This histidine, however, is not
conserved in other PDZ domains (C. Ponting et al., Trends in Biological Sciences,
20:102-103 (1995)), suggesting distinct peptide-binding specificities.

Since PDZ domains mediate specific protein-protein interactions, determining
the physiological ligand(s) for orphan PDZ domains is critical to understanding the
biological function of PDZ-containing proteins. Recent evidence shows
that interaction between the PDZ domain and peptide ligands can be regulated by
differential affinity (B. Muller et al., Neuron, 17:255-265 (1996)) and by protein
phosphorylation (N. Cohen et al., Neuron, 17:759-767 (1996)). These mechanisms,
however, are not adequate to explain the diversity of PDZ-target protein interactions
in both excitable and non-excitable tissues.

Nitric oxide (NO), an endogenous signaling molecule, plays critical roles in
nervous, immune, and cardiovascular function (D. Bredt et al., Ann. Rev. Biochem.,
63:175-195 (1994); M. Marletta, J. Biol. Chem., 268:12231-12234 (1993); S.
Moncada et al., N. Engl. J. Med., 329:2002-2012 (1993)). Physiological studies have
demonstrated numerous functions for neuron-derived NO, produced primarily by the
neuronal NO synthase (nNOS). However, excess nNOS activity mediates brain injury
in cerebral ischemia and in animal models of Parkinson's disease (T. Dawson et al.,
Ann. Neurol., 32:297-311 (1992); P. Hantraye et al., Nature Medicine, 2:1017-1021
(1996); Z. Huang et al., Science, 265:1883-1885 (1994)). Excess nNOS activity has
been broadly linked with many neurodegenerative diseases, motility disorders and
muscular dystrophies, including Alzheimer's disease and Huntington's disease (see
generally D. Bredt et al., Nature, 351:714-718 (1991)). nNOS activity must therefore
be tightly regulated. One level of regulation is reflected by molecular targeting of
nNOS to specific intracellular membrane domains (C. Aoki et al., Brain Res.,
620:97-113 (1993)). This subcellular localization is mediated by the N-terminus of nNOS,
which contains a PDZ domain (J. Brenman et al., Cell, 82:743-752 (1995)). This
N-terminal domain of nNOS interacts with the PDZ domain of α1-syntrophin and the
second PDZ domains of PSD-95 and PSD-93. These interactions target nNOS to
synaptic sites in skeletal muscle and brain (J. Brenman et al., Cell, 84:757-767
(1996)). The structural details of these PDZ-PDZ interactions are not yet known.

Several lines of evidence suggest that additional binding partners for the PDZ
domain of nNOS may also exist. First, not all membrane-associated nNOS in brain is
bound to PSD-95 and related proteins (J. Brenman et al., Journal of Neuroscience,
(1996) (in press); unpublished observations). Also, in certain muscle diseases, nNOS
does not interact properly with α1-syntrophin at the skeletal muscle sarcolemma (D.
Chao et al., Journal of Experimental Medicine, 184:609-618 (1996)). We therefore
sought to determine whether specific carboxylate-peptides might also associate with
the PDZ domain of nNOS. Identification of such peptides would facilitate the
structure and function study of PDZ domains. Also, the in vitro defined peptide
sequences may help identify additional nNOS interacting proteins.

Protein-protein interactions have been generally studied in the past using
biochemical techniques such as cross-linking, co-immunoprecipitation and
co-fractionation by chromatography. One of the disadvantages of these techniques is that
interacting proteins often exist in very low abundance and are, therefore, difficult to
detect. Another major disadvantage is that these biochemical techniques involve only
the proteins, not the genes encoding them. When an interaction is detected using
biochemical methods, the newly identified protein often must be painstakingly
isolated and then sequenced to enable the gene encoding it to be obtained. Another
disadvantage is that these methods do not immediately provide information about
which domains of the interacting proteins are involved in the interaction.

In vitro determination of ligands for peptide-binding domains, such as SH3
and SH2 motifs, has been achieved using two types of random peptide libraries (A.
Sparks et al., Methods Enzymol., 255:498-509 (1995); M. Sheng, Neuron, 17:575-578
(1996); S. Zhou et al., Methods Enzymol., 254:523-535 (1995); and review by M.
Gallop et al., Journal of Medicinal Chemistry, 37:1233-1251 (1994)). One strategy
utilizes the filamentous phage coat protein to display random N-terminal peptides. By
repeated rounds of affinity panning and amplification, individual interacting peptides
can be identified by sequencing the corresponding coding region of phage DNA (A.
Sparks et al., Methods Enzymol., 255:498-509 (1995)). A second approach uses
soluble random peptides that are chemically synthesized. By affinity purification of a
mixture of bound peptides and subsequent peptide sequencing, a population based
consensus can be deduced (S. Zhou et al., Methods Enzymol., 254:523-535 (1995)).
Because the phage display system only accommodates N-terminal peptides, it cannot
be used to select C-terminal peptides for the PDZ domain. Although chemical peptide
libraries are applicable, the approach has difficulties in accommodating cysteine and
tryptophan and does not provide individual ligand sequences. As a result, analyses of
chemical libraries cannot resolve compensatory effects potentially present in peptides
of low abundance and may miss high affinity sequences containing tryptophan and
cysteine. Thus, it would be desirable to use a genetic strategy to screen a large pool
of C-terminal peptides containing all 20 amino acids to identify individual PDZ
binding peptides.

SUMMARY OF THE INVENTION 

In one aspect, the invention relates to peptides capable of altering the
interaction between the nNOS PDZ domain and the proteins with which this domain
interacts. The peptides preferably alter the interactions between the nNOS PDZ
domain and melatonin or non-NMDA type glutamate receptors. The peptides of the
invention are useful in the formulation of therapeutic compositions which alter
intermolecular binding between the nNOS PDZ domain and the proteins with which this
domain interacts in vivo. Via inhibition of these interactions, the peptides of the
invention will be useful in suppressing the production of excess levels of NO which
are neurotoxic and contribute to myofiber necrosis. For example, the peptides of the
invention can be used to treat many neurodegenerative diseases, including stroke,
ALS, Alzheimer's disease, Parkinson's disease and Huntington's disease. The
peptides are also useful for the treatment of muscular dystrophies such as Duchenne
muscular dystrophy and motility disorders such as irritable bowel syndrome.

Another aspect of the invention is to provide peptides comprising the general 
sequence D-X-V-COOH wherein D=Aspartic acid, X is variable and V=Valine. 

Another object of the invention is to provide peptides capable of altering the
interaction between the nNOS PDZ domain and the proteins with which this domain
interacts, which peptides are useful as commercial laboratory or bioprocess reagents.

Another object of the invention is to provide peptides which can be used as 
molecular probes that specifically label nNOS. For instance, the peptides of the 
invention can be labeled according to standard procedures in the art and can be used 
as molecular probes to detect nNOS in vivo or in vitro.

The invention also provides a kit comprising peptides which interact with the 
PDZ domain of nNOS. 

Another aspect of the invention is isolated nucleic acid sequences that encode 
the peptides described herein. 

Another object of the invention is to couple a genetic system that identifies
peptides which interact with a given protein domain (orphan protein domain) with the
available electronic sequence databases. The genetic system provides the sequence of
the peptide which interacts with the orphan protein domain. This sequence is then
used to identify proteins already present in the electronic nucleic acid and protein
sequence databases. A Protein Interaction Network (PIN) is then assembled which
correlates the peptide sequences which interact with a given orphan protein domain.
Assembly of many different PINs results in the assembly of a Super Protein
Interaction Network (SPIN) which will serve as an electronic extension for existing
sequence databases. This allows the researcher to search the database with the
sequence of a given orphan protein domain for peptide sequences which are known to
specifically interact with a given orphan protein domain.

The invention also relates to a peptide ligand detection system that includes a
random peptide library preferably of at least about 10^6 members comprising a
recombinant DNA vector encoding a DNA binding protein. The DNA binding
protein is selected to specifically bind a DNA sequence on the vector. The DNA
binding protein encoded by the DNA vector comprises a random peptide sequence
covalently linked to the DNA binding protein as an in-frame fusion protein. The
fusion protein is typically formatted so that the DNA vector can encode preferably at
least about 10^6 different fusion proteins up to about 10^8 fusion proteins or more, each
of which is capable of specifically binding the DNA sequence on the vector. The
peptide ligand detection system further includes an orphan protein domain sequence
immobilized on a solid support that is capable of specifically binding the random
peptide of the DNA binding protein.

Significantly, the ligand detection system of the present invention utilizes an
immobilized orphan protein domain sequence to specifically bind the random peptide
of the in-frame fusion protein. Typically, the orphan protein domain sequence is a
contiguous or non-contiguous amino acid sequence within the linear sequence of a
protein of interest. Sometimes the orphan protein domain sequence is referred to as a
protein module. In contrast, prior ligand detection systems using random peptide
libraries rely on substantially larger molecules to bind the ligand, e.g., receptors,
antibodies, or enzymes. Exemplary orphan protein domain sequences are illustrated
below in Figure 7.

The peptide ligand detection system can further include an inducer molecule
capable of specifically binding the DNA binding protein. Typically, the inducer
molecule is selected to release the recombinant DNA vector from the immobilized
orphan protein domain sequence. In particular, the inducer molecule can be
isopropylthio-β-D-galactoside (IPTG).

A peptide ligand detection system in accord with the present invention can
include one of a variety of suitable recombinant DNA vectors. That is, the
recombinant DNA vectors can encode a variety of suitable DNA binding proteins and
DNA sequences capable of being bound by the DNA binding proteins.

For example, the DNA binding protein of the peptide ligand detection system 
can include a prokaryotic repressor protein sequence. In addition, the DNA sequence 
bound by the DNA binding protein can be a prokaryotic operator sequence. More 
specifically, the prokaryotic repressor protein sequence can be a lac repressor or a
fragment thereof capable of specifically binding the DNA sequence on the 
recombinant DNA vector. In addition, the prokaryotic operator sequence can be lac O 
or a fragment thereof capable of being specifically bound by the prokaryotic repressor 
protein sequence. 

As noted, the recombinant DNA vectors of the random peptide library are
formatted to express the random peptide as a fusion protein. A DNA binding protein
of the invention typically features high avidity binding to DNA and has a region
preferably at the C-terminus of the protein that can accept an amino acid sequence
insertion without interfering with the DNA binding activity of the protein. The
half-life of a specific binding pair formed between the DNA binding protein and the
recombinant DNA vector must be long enough for screening to occur. In general, that
half-life will be at least about one to four hours or longer. The half-life of the specific
binding pair formed between the random peptide and the immobilized orphan protein
domain will also be about one to four hours or longer.

If desired, the peptide ligand detection system can include an in-frame peptide
linker sequence, e.g., between the prokaryotic repressor protein sequence (or
fragment) and the random peptide sequence.

A peptide ligand detected by the present ligand detection system is capable of
specifically binding the immobilized orphan protein domain of interest. The binding
affinity (EC50) of the specific binding interaction depends on several parameters such
as the degree of binding affinity desired and the complexity of the random peptide
sequence. However, in general the binding affinity will be in the micromolar to
nanomolar range for most immobilized orphan protein domains.
As will be discussed more fully below, an exemplary peptide ligand in accord
with the present invention comprises about 3, 6, 7, 8, 9, 10, 12, 15, 20, 25,
30, 35, 40, 50 or more amino acids. For example, the present invention provides a
peptide ligand comprising about 3, 6, 7, 8, 9 or 10 amino acids in which the
C-terminal sequence of the peptide ligand consists of the sequence D-X-V-COOH,
wherein D is Asp, X is any amino acid, preferably any of the 20 common natural
amino acids, and V is Val. That peptide ligand has been found to specifically bind a
specified orphan protein domain (PDZ).

In general, a peptide ligand in accord with the invention has a binding affinity
(EC50) for an orphan protein domain preferably in the micromolar to nanomolar
range. Preferred peptide ligands have an EC50 in the nanomolar range.
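
As an illustration of what an EC50 in the nanomolar range implies, the following is a minimal Python sketch. It assumes a simple one-site hyperbolic binding model, which is an assumption made here for illustration only and is not stated in this specification; under that model the EC50 is the ligand concentration that gives half-maximal binding signal. The function name and the example concentrations are hypothetical.

```python
def fraction_of_max_signal(ligand_nM: float, ec50_nM: float) -> float:
    """Fraction of maximal binding signal under an ASSUMED one-site model."""
    if ligand_nM < 0 or ec50_nM <= 0:
        raise ValueError("ligand must be non-negative and EC50 must be positive")
    return ligand_nM / (ligand_nM + ec50_nM)


# Hypothetical comparison at a fixed ligand concentration of 100 nM:
# one ligand with a nanomolar EC50 and one with a micromolar EC50.
nanomolar_ligand = fraction_of_max_signal(100.0, ec50_nM=10.0)     # EC50 = 10 nM
micromolar_ligand = fraction_of_max_signal(100.0, ec50_nM=1000.0)  # EC50 = 1 uM
```

Under these assumed values the nanomolar ligand gives a signal close to saturation while the micromolar ligand gives only a small fraction of it, which is one way to read the stated preference for nanomolar ligands.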

In particular, the immobilized orphan protein domain can be a PDZ domain
such as those obtained from a variety of known proteins such as neuronal nitric oxide
synthase (nNOS), post-synaptic density protein (PSD-95/SAP-90), post-synaptic density
protein (PSD-93), epithelial tight-junction protein zona occludens-1 (ZO-1),
N-methyl-D-aspartate (NMDA) type glutamate receptor, Shaker-type potassium channel
subunit and α1-syntrophin.

The invention further provides therapeutic compositions comprising a peptide 
ligand of the present invention. The therapeutic compositions are preferably provided 
in a pharmaceutically acceptable vehicle, e.g., sterile and pyrogen-free. Examples of
preferred therapeutic compositions are specified below. 

Further provided are isolated nucleic acids encoding peptide ligands of the 
present invention and particularly DNA vectors comprising the isolated nucleic acids. 

The present invention also provides a method of detecting a peptide ligand
capable of specifically binding an orphan protein domain of interest. In general, the
method includes lysing transformed cells comprising the random peptide library
generally discussed above. The lysing is under conditions such that the DNA binding
protein comprising the random peptide remains bound to the recombinant DNA
vector. The method further includes the steps of contacting the fusion proteins of the
random peptide library to an immobilized orphan protein domain under conditions
conducive to specific peptide-orphan protein domain binding and isolating a
recombinant DNA vector encoding a fusion protein that specifically binds to the
orphan protein domain.
In most cases, the method will further include the steps of transforming a host
cell with the isolated recombinant DNA vector obtained, repeating the lysing and
contacting steps and isolating a selected recombinant DNA vector. As will be shown
below in the examples, practice of this method leads to amplification of the selected
recombinant DNA vector.

The method will also typically include the steps of determining the amino
acid sequence of the random peptide encoded by the selected recombinant DNA 
vector, and searching a protein sequence database to identify an orphan protein 
domain in the database comprising the random peptide. 
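
The database search step can be sketched in Python as follows. This is a minimal sketch, assuming that the selected peptides reduce to a C-terminal consensus such as the D-X-V-COOH sequence described above and that the database is available as a simple mapping of record names to amino acid sequences. The record names and sequences are hypothetical toy data, and the helper names are illustrative only.

```python
import re

# C-terminal Asp-X-Val: D, then any one of the 20 common residues, then V,
# anchored at the end of the sequence (the free carboxy terminus).
DXV_CTERM = re.compile(r"D[ACDEFGHIKLMNPQRSTVWY]V$")


def ends_with_dxv(sequence: str) -> bool:
    """Return True if the sequence ends in the D-X-V consensus."""
    return DXV_CTERM.search(sequence.strip().upper()) is not None


def scan(records: dict) -> list:
    """Return the names of all records whose sequence ends in D-X-V."""
    return [name for name, seq in records.items() if ends_with_dxv(seq)]


# Hypothetical toy records, not taken from any real database.
records = {
    "hypothetical_protein_A": "MKTLLSAGEDSV",  # ends in D-S-V
    "hypothetical_protein_B": "MKTLLSAGETSV",  # ends in T-S-V
    "hypothetical_protein_C": "MDAVKLLQ",      # D-A-V is internal only
}
hits = scan(records)
```

Anchoring the pattern at the end of the sequence reflects the requirement that the consensus sit at the free carboxy terminus; an internal D-X-V, as in the third toy record, is not reported.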

If desired, the method can further include the step of assembling a protein
interaction network (PIN) sufficient to correlate (particularly match) a plurality of
random peptide sequences to the orphan protein domain. In this method, the plurality
of random peptide sequences are capable of binding the correlated orphan protein
domain with a binding affinity in the micromolar to nanomolar range as noted below.
The method can further include assembling a super protein interaction network
(SPIN) comprising a plurality of protein interaction networks (PINs) sufficient to
serve as an electronic extension database for the protein sequence database.

Typically, the assembly is assisted by one or more suitable computer programs
such as those generally known in the field for compiling protein and/or nucleic acid
sequences in a matrix or matrix-type format. The matrix or matrix-type format can be
readily searched with a test sequence that can be, e.g., a peptide ligand sequence or
orphan domain sequence in accord with the invention.
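
One way to picture a searchable collection of PINs is the following minimal Python sketch. It is an illustrative data structure only, not the program contemplated by this specification: the class name, method names, domain names and sequences are all hypothetical, and exact substring matching stands in for whatever sequence comparison a real implementation would use.

```python
from collections import defaultdict


class SuperProteinInteractionNetwork:
    """Toy SPIN: a collection of PINs keyed by orphan protein domain name."""

    def __init__(self):
        self._domain_sequences = {}    # domain name -> domain amino acid sequence
        self._pins = defaultdict(set)  # domain name -> set of ligand peptides

    def add_pin(self, domain_name, domain_sequence, peptides):
        """Register (or extend) the PIN for one orphan protein domain."""
        self._domain_sequences[domain_name] = domain_sequence.upper()
        self._pins[domain_name].update(p.upper() for p in peptides)

    def ligands_for(self, query_sequence):
        """Return ligand peptides of every domain whose sequence occurs in the query.

        Exact substring matching is a simplification for illustration only.
        """
        query = query_sequence.upper()
        hits = set()
        for name, domain_seq in self._domain_sequences.items():
            if domain_seq in query:
                hits |= self._pins[name]
        return hits

    def domains_binding(self, peptide):
        """Return the sorted names of domains whose PIN contains the peptide."""
        peptide = peptide.upper()
        return sorted(name for name, peps in self._pins.items() if peptide in peps)


# Hypothetical toy entries, not real domain or ligand sequences.
spin = SuperProteinInteractionNetwork()
spin.add_pin("toy_domain_1", "GLGFSIAG", ["GEDSV", "AADIV"])
spin.add_pin("toy_domain_2", "GLGFNIVG", ["KETSV"])
```

The structure can be searched in either direction: `spin.ligands_for(...)` takes a protein sequence containing a registered domain and returns the correlated peptides, while `spin.domains_binding(...)` takes a peptide and returns the domains correlated with it.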

The invention further provides a method of detecting a peptide ligand capable
of specifically binding an orphan protein domain of interest, the method comprising
searching the super protein interaction network (SPIN) with an amino acid sequence
comprising an orphan protein domain of interest, and identifying the peptide ligand
capable of specifically binding the orphan protein domain of interest. The peptide
ligand can be obtained from any suitable source such as any of the random peptide
libraries discussed previously.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a schematic diagram showing affinity selection from a C-terminal
peptide library.



WO 98/23781 PCT/US97/21861


Figure 2A is a graph showing affinity selection of peptides interacting with
PDZ3 of PSD-95 by ELISA.

Figure 2B is an alignment of deduced amino acid sequences of PDZ3 specific
clones. Eleven clones were randomly chosen and sequenced. The single-letter code for
the 20 amino acids is used. Italic letters indicate amino acids present at the end of the
linker which separates Lac I and the fused peptide. An asterisk (*) indicates a stop codon.

Figure 3A is a graph showing in vitro selection of peptides interacting with
nNOS-PDZ. The graph shows identification of nNOS-PDZ interacting clones by
ELISA. After 4 rounds of affinity panning, a total of 150 individual clones were
randomly selected and tested for interaction with nNOS-PDZ by ELISA as described
in experimental procedures. Clones 1 to 48 are shown (horizontal axis). Gray bars:
BSA; open bars: GST-NABherg + BSA; closed bars: GST-nNOS-PDZ + BSA.

Figures 3B and 3C illustrate a sequence alignment of 95 independent nNOS
binding peptides (NBPs). The deduced amino acid sequences of the clones were
obtained and aligned according to the first stop codon (*). The italic Gs are part of
the linker region. The library template (GGG-X15-*) is shown at the top of the sequence
alignment.

Figures 4A-4I are graphs showing determinations of a consensus nNOS
binding peptide (NBP). Normalized amino acid abundance of the final nine residues
from the population of 95 independent nNOS binding peptides (closed bars) is
compared in each figure with codon frequency in the original library (open bars).
Residues in the library linker region were not included in each figure.

Figure 5A is a graph showing that all 95 NBPs fail to interact with PDZ3. ELISA
results of 36 randomly chosen NBP clones are shown. Horizontal axis: NBP clone
number; vertical axis: ELISA signal normalized against clones with strongest binding.

Figure 5B is a graph illustrating that mutating Y77D78 to H77E78 changes the
nNOS PDZ binding specificity from D-X-V to T-X-V. ELISA results of two high
affinity peptides are shown. NBP-161 for nNOS (EC50 = ~8 nM) and PD-325 for
PDZ3 (EC50 = ~2 nM) are expressed as maltose binding protein fusions and affinity
purified on amylose agarose beads (see Experimental Procedures).

Figure 5C is a graph showing that the aspartate at the -2 position is critical for
NBP binding. Single amino acid substitutions at the -2 position were obtained. The
peptides were expressed as maltose binding protein fusions at the C-terminus (see
Experimental Procedures). ELISA results of seven mutants are shown.

Figure 5D is a representation of a Western immunoblot. Solubilized brain
extracts were incubated with amylose resin alone (lane 1), amylose resin saturated
with a maltose binding protein fusion containing a C-terminal NBP-123 (lane 2) or
with the same fusion protein in which the -2 aspartate was changed to threonine (lane
3). The beads were washed and retention of nNOS was detected by western blotting.
Molecular weight standards in kDa are marked on the left.

Figure 6 is a schematic diagram showing that functional nNOS PDZ has a
uniquely large structure. The location of the PDZ domain is shown in the N-terminus
of nNOS. Interaction of nNOS with the PDZ domains of PSD-93 requires amino
acids 16-130 of nNOS. Association of nNOS fusions with PSD-93 was evaluated by
the yeast two hybrid system and is expressed as β-galactosidase units. Interactions of
five different NBPs (#64-68) with nNOS fusions were evaluated by ELISA and are
expressed as normalized OD405.

Figure 7 is a list of known orphan protein domains (common protein 
modules). 

Figures 8A-8R show results of a search (scan) of a non-redundant protein
sequence database (Genbank) identifying protein sequences comprising the
-D-X-V-COOH sequence where D is Asp, X is any of the 20 common amino acids, and V is
Val. Identified protein sequences are listed in bold script and are grouped according
to species (human, mouse, rat, etc.). Various descriptors accompany each identified
protein sequence in accord with nomenclature adopted by Genbank.
DETAILED DESCRIPTION OF THE INVENTION 

Unless defined otherwise, all technical and scientific terms used herein have
the same meaning as commonly understood by one of ordinary skill in the art to
which this invention belongs. Although any methods and materials similar or
equivalent to those described herein can be used in the practice or testing of the
present invention, the preferred methods and materials are described. For purposes of
the present invention, the following terms are defined below.

In the polypeptide notation used herein, the left-hand direction is the amino
terminal direction and the right-hand direction is the carboxy-terminal direction, in
accordance with standard usage and convention. Similarly, unless specified
otherwise, the left-hand end of single-stranded polynucleotide sequences is the 5' end;
the left-hand direction of double-stranded polynucleotide sequences is referred to as
the 5' direction. The direction of 5' to 3' addition of nascent RNA transcripts is
referred to as the transcription direction; sequence regions on the DNA strand having
the same sequence as the RNA and which are 5' to the 5' end of the RNA transcript
are referred to as "upstream sequences"; sequence regions on the DNA strand having
the same sequence as the RNA and which are 3' to the 3' end of the RNA transcript
are referred to as "downstream sequences".

The term "protein interaction inhibitor" is used herein to refer to an agent
which is identified by one or more screening method(s) of the invention as an agent
which selectively inhibits protein-protein binding between a first interacting
polypeptide and a second interacting polypeptide. Some protein interaction inhibitors
may have therapeutic potential as drugs for human use and/or may serve as
commercial reagents for laboratory research or bioprocess control. Protein interaction
inhibitors which are candidate drugs are then tested further for activity in assays
which are routinely used to predict suitability for use as human and veterinary drugs,
including in vivo administration to non-human animals and often including
administration to humans in approved clinical trials.

As used herein, the term "operably linked" refers to a linkage of
polynucleotide elements in a functional relationship. A nucleic acid is "operably
linked" when it is placed into a functional relationship with another nucleic acid
sequence. For instance, a promoter or enhancer is operably linked to a coding
sequence if it affects the transcription of the coding sequence.

Operably linked means that the DNA sequences being linked are typically
contiguous and, where necessary to join two protein coding regions, contiguous and in
reading frame. However, since enhancers generally function when separated from the
promoter by several kilobases and intronic sequences may be of variable lengths,
some polynucleotide elements may be operably linked but not contiguous.

As used herein, the term "orphan protein domain" refers to any domain of a
protein which binds or interacts with another protein, particularly but not limited to
PDZ domains. Orphan protein domains are typically contiguous stretches of amino
acids that facilitate protein-protein interactions. Orphan protein domains, however,
do include domains comprising non-contiguous stretches of amino acids that through
secondary and tertiary structure are brought into association to facilitate
protein-protein interactions. Protein-protein interactions typically comprise, but are not
limited to, non-covalent bonds that account for the specificity of interaction between
two proteins. Examples of such non-covalent bonds include van der Waals contacts,
hydrogen bonds and salt bridges. Examples of known orphan protein domains are set
forth in Figure 7.

Preferred orphan protein domains have a length of between about 1 to 1000
amino acids, preferably about 1 to 500 amino acids, and more preferably about 1 to
100 amino acids. Particularly preferred orphan protein domains include more than
one amino acid and are capable of specifically binding a peptide ligand with a binding
affinity (EC50) of between about 0.001 to 100 µM, preferably 0.2 to 1 µM and more
preferably 8 to 100 nM as defined by any suitable immunological assay such as
Western blotting, ELISA, RIA, gel mobility shift assay, enzyme immunoassay,
competitive assays, saturation assays or other suitable protein binding assays known
in the field and specified below. See generally Ausubel et al., Current Protocols in
Molecular Biology, John Wiley & Sons, New York (1989), Sambrook et al., infra, and
Harlow and Lane, Antibodies: A Laboratory Manual, CSH Publications, N.Y. (1988),
for disclosure relating to suitable methods for detecting specific binding between
proteins.

A "DNA binding protein" as the term is used herein, refers to a protein that
specifically binds a DNA strand and preferably two DNA strands of the recombinant
DNA vector. More preferably, the DNA binding protein specifically binds to the
specific DNA sequence included in the vector. In embodiments of the invention in
which RNA vectors are used, DNA binding protein can also refer to an RNA binding
protein.

Suitable DNA binding proteins are known in the field. For example, suitable
prokaryotic DNA binding proteins include lac repressor, phage 434 repressor, lambda
phage cI and cro repressors, phage P22 Arc and Mnt repressors, and CAP protein.
Also included are eukaryotic DNA binding proteins such as those comprising
homoeoboxes with helix-turn-helix motifs, proteins including helix-loop-helix
structures particularly myc, fos, jun and other proteins including leucine zippers and
DNA binding domains, POU domain proteins, TFIIIA, and yeast Gal4 protein.

Preferably, the DNA binding protein is the lac repressor, particularly the 37
kDa protein encoded by the E. coli lacI gene capable of repressing transcription from the
lacZYA operon by binding to a specific DNA sequence termed lacO. See e.g.,
Ausubel et al., supra; Sambrook et al., supra; Knight et al., J. Biol. Chem. 264:3639-
3642 (1989); Beyreuther in The Operon (Miller and Reznikoff eds., Cold Spring
Harbor Laboratory (1980)).

A "host cell" as the term is used herein is a eukaryotic or prokaryotic cell or 
cell group that is capable of being transformed by a recombinant DNA vector.
Preferably, the host cell is a suitable bacterial strain such as E. coli K12.

A "peptide ligand" refers to a molecule and particularly a peptide such as a
random peptide that is capable of being specifically bound by an immobilized orphan
protein domain. In addition, the peptide ligand is capable of being bound by the
orphan protein domain as it exists in a protein. Preferably, the binding affinity (EC50)
between the peptide ligand and the immobilized orphan protein domain is between
about 0.001 to 100 µM, preferably 0.2 to 1 µM and more preferably 8 to 100 nM as
determined by a suitable binding assay as described herein.

By the term "specific binding" or similar term is meant a molecule disclosed
herein which binds another molecule, thereby forming a specific binding pair, but
which does not recognize and bind to other molecules as determined by, e.g., Western
blotting, ELISA, RIA, gel mobility shift assay, enzyme immunoassay, competitive
assays, saturation assays or other suitable protein binding assays known in the field.

By the term "immobilized orphan protein domain" is meant an amino acid
sequence corresponding to a desired orphan protein domain that has been covalently
or non-covalently bound to a solid support or surface such as a particle or a dish. If
desired, the immobilized orphan protein can be immobilized by attaching an
immunologically recognizable ligand, e.g., biotin, bound to streptavidin which is
attached to the solid support or surface. If desired, the ligand may be attached by a
peptide linker sequence.

Exemplary peptide linker sequences in accord with the invention comprise up
to 20 amino acids, preferably up to about 10 amino acids, and more preferably from
about 1 to 5 amino acids. The linker sequence is generally flexible so as not to hold the
random peptide in a single rigid conformation. The linker sequence can be used, e.g.,
to space the DNA binding protein from the fused random peptide sequence.

Preferably, the orphan protein domain will be up to about 1000, preferably up to
about 500, and more preferably up to about 100 amino acids in length. It is also
preferred that the orphan protein domain be immobilized on a solid support or surface
which is conducive to standard affinity panning (i.e. biopanning or panning) techniques
capable of detecting nanomolar binding affinities between proteins. A preferred solid
support is a microtitre dish.

The term "random peptide" refers to an amino acid oligomer comprising two
or more amino acid residues that have been constructed by a recognized stochastic or
random process. A "random peptide library" refers not only to a set of recombinant
DNA vectors that encodes a set of random peptides, but also to the set of random
peptides encoded by those vectors, as well as the fusion proteins containing those
random peptides.

The Protein Interaction Network (PIN) is generally applicable to identifying 
the amino acid sequences which interact with a given orphan protein domain. 

A PIN in accord with the invention can be assembled and then stored in a
variety of ways. For example, a desired PIN can be assembled and stored by use of a
computer program such as Netscape and particularly a Netscape assisted program.
The program can be run (i.e. performed) on any suitable computer such as a PC
(IBM) or Macintosh (Apple) computer. A preferred PIN includes between about 100
to 10^13, preferably about 1000 to 10 l \ and more preferably about 10 ,: peptide ligand
sequences.

Once assembled, the PIN of interest can be further assembled into a Super
Protein Interaction Network (SPIN) by use of a computer program such as BLAST
run on, e.g., a conventional central server system. The size of the SPIN will depend
on several parameters such as the complexity of the PIN assembly and desired
electronic connections with other database networks. In general, the SPIN will
include between about 5 to 10 s, preferably 500 to 10\ and more preferably 500 to 10
PINs. Compilation and analysis of multiple PINs is facilitated by any number of
stand-alone computer-assisted programs, particularly BLAST and other secondary
sequence computer programs known in the field.

The present invention is based on the discovery that a random fusion protein
library wherein random peptides are fused to the C-terminus of a bacterial DNA
binding protein such as a transcriptional repressor can be used to select for specific
peptide ligands that bind to a given orphan domain. The gene encoding the fusion
protein is operably linked on a plasmid to the fusion protein's binding site. Following
expression, or induction of the expression, of the fusion protein in a transformed or
transfected host cell, the fusion protein binds to its cognate binding sequence on the
plasmid. This linkage of the fusion protein to the plasmid which itself encodes the
fusion protein allows for repeated rounds of selection for specific peptide ligands in
the library by affinity purification of fusion protein-plasmid complexes using an
orphan domain of interest. The plasmid can then be dissociated from the complex and
used to retransform appropriate host cells for another round of selection.

Generally, the nomenclature used hereafter and the laboratory procedures in
cell culture, molecular genetics, and nucleic acid chemistry described
below are those well known and commonly employed in the art. Standard techniques
are used for recombinant nucleic acid methods, polynucleotide synthesis, and
microbial culture and transformation (e.g., electroporation, lipofection). Generally
enzymatic reactions and purification steps are performed according to the
manufacturer's specifications. The techniques and procedures are generally
performed according to conventional methods in the art and various general
references (see, generally, Sambrook et al., Molecular Cloning: A Laboratory Manual,
2d ed. (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., which
is incorporated herein by reference) which are provided throughout this document.
The procedures therein are believed to be well known in the art and are provided for
the convenience of the reader. All the information contained therein is incorporated
herein by reference.

General methods for assembling amino acid and nucleic acid sequence data in
accord with the methods described herein have been disclosed. See S. Altschul et al.,
J. Mol. Biol., 215:403-410 (1990); and S. Altschul et al., Nuc. Acids Res., 25:3389-
3402 (1997) for disclosure relating to the BLAST, particularly gapped BLAST, and
PSI-BLAST computer programs, the disclosures of which are fully incorporated herein
by reference.

Peptides of the invention comprising those that bind to nNOS are at least 3
amino acids long and comprise the consensus sequence Asp-X-Val. Peptides of
longer length are also encompassed within the invention with the proviso that the
peptide contain the consensus sequence, preferably at the C-terminal end.
Accordingly, peptides of at least 5 amino acids, at least [illegible] amino acids, at least 10
amino acids and at least 15 or more amino acids are encompassed.
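
The length and consensus requirements above can be checked mechanically. The following is a minimal sketch, assuming peptides are written in one-letter code with the amino terminus on the left (D = Asp, V = Val, X = any of the twenty standard residues). The function names and the example sequences are hypothetical and are not taken from the specification.

```python
import re

# The twenty standard amino acids in one-letter code; X in Asp-X-Val is
# taken here to mean any one of them (an assumption of this sketch).
STANDARD_RESIDUES = "ACDEFGHIKLMNPQRSTVWY"

# Asp-X-Val anywhere in the peptide, and Asp-X-Val at the C-terminal
# (right-hand) end, which is the preferred position.
DXV_ANYWHERE = re.compile(f"D[{STANDARD_RESIDUES}]V")
DXV_C_TERMINAL = re.compile(f"D[{STANDARD_RESIDUES}]V$")


def contains_consensus(peptide: str) -> bool:
    """True if the peptide is at least 3 residues long and contains Asp-X-Val."""
    peptide = peptide.upper()
    return len(peptide) >= 3 and DXV_ANYWHERE.search(peptide) is not None


def has_c_terminal_consensus(peptide: str) -> bool:
    """True if the last three residues of the peptide are Asp-X-Val."""
    peptide = peptide.upper()
    return len(peptide) >= 3 and DXV_C_TERMINAL.search(peptide) is not None


if __name__ == "__main__":
    # Hypothetical example peptides, for illustration only.
    for candidate in ["DAV", "GGSDSV", "DAVKK", "GGSTSV"]:
        print(candidate, contains_consensus(candidate),
              has_c_terminal_consensus(candidate))
```

A peptide such as the hypothetical "DAVKK" contains the consensus but not at its C-terminus, so it satisfies the broader definition while falling outside the preferred embodiment.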

The peptides of the invention may be prepared by recombinant nucleotide
expression techniques or by chemical synthesis using standard peptide synthesis
techniques. For example, peptides of the invention can be produced by
expressing cloned nucleotide sequences. Alternatively, peptides of the invention can
be generated directly from intact protein products. Peptides can be specifically
cleaved by proteolytic enzymes, including, but not limited to, trypsin, chymotrypsin
or pepsin. Each of these enzymes is specific for the type of peptide bond it attacks.
Trypsin catalyzes the hydrolysis of peptide bonds whose carbonyl group is from a
basic amino acid, usually arginine or lysine. Pepsin and chymotrypsin catalyze the
hydrolysis of peptide bonds from aromatic amino acids, particularly tryptophan,
tyrosine and phenylalanine. Alternate sets of cleaved peptide fragments are generated
by preventing cleavage at a site which is susceptible to a proteolytic enzyme. For
example, reaction of the epsilon-amino groups of lysine with ethyltrifluorothioacetate
in mildly basic solution yields a blocked amino acid residue whose adjacent peptide
bond is no longer susceptible to hydrolysis by trypsin (Goldberger et al., Biochem.,
1:401 (1962)).

Peptides of the invention also can be modified to create peptide linkages that
are susceptible to proteolytic enzyme catalyzed hydrolysis. For example, alkylation
of cysteine residues with beta-halo ethylamines yields peptide linkages that are
hydrolyzed by trypsin (Lindley, Nature, 178:647 (1956)). In addition, chemical
reagents that cleave peptide chains at specific residues can be used (Witkop, Adv.
Protein Chem., 16:221 (1961)). For example, cyanogen bromide cleaves peptides at
methionine residues (Gross et al., J. Am. Chem. Soc., 83:1510 (1961)). Thus, by
treating full-length proteins with various combinations of modifiers, proteolytic
enzymes and/or chemical reagents, numerous discrete overlapping peptides of varying
sizes are generated. These peptide fragments can be isolated and purified from such
digests by chromatographic methods.
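
How different cleavage agents yield overlapping fragment sets can be illustrated with a simple in silico digest. The sketch below is a simplification under stated assumptions: trypsin is modeled as cutting after every lysine (K) or arginine (R), blocked lysines are modeled by cutting after arginine only, and cyanogen bromide is modeled as cutting after every methionine (M). Real enzymes have additional sequence preferences that are ignored here, and the protein sequence is hypothetical.

```python
def cleave_after(sequence: str, residues: str) -> list[str]:
    """Split a one-letter protein sequence after each listed residue."""
    fragments = []
    start = 0
    for index, residue in enumerate(sequence):
        # Cut on the carboxyl side of the residue; a match at the very
        # last position produces no additional (empty) fragment.
        if residue in residues and index < len(sequence) - 1:
            fragments.append(sequence[start:index + 1])
            start = index + 1
    fragments.append(sequence[start:])
    return fragments


def trypsin_digest(sequence: str, lysines_blocked: bool = False) -> list[str]:
    """Model trypsin; with blocked lysines only arginine sites are cut."""
    return cleave_after(sequence, "R" if lysines_blocked else "KR")


def cyanogen_bromide_digest(sequence: str) -> list[str]:
    """Model cyanogen bromide cleavage at methionine residues."""
    return cleave_after(sequence, "M")


if __name__ == "__main__":
    protein = "MAKTGRLSMDKV"  # hypothetical sequence, for illustration only
    print(trypsin_digest(protein))
    print(trypsin_digest(protein, lysines_blocked=True))
    print(cyanogen_bromide_digest(protein))
```

Because each treatment cuts at different positions, the resulting fragment sets overlap one another, which is the property the passage relies on for generating discrete overlapping peptides.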

Most preferably, isolated peptides of the present invention can be synthesized
using an appropriate solid state synthetic procedure (Steward and Young, Solid Phase
Peptide Synthesis, Freemantle, San Francisco, Calif. (1968)). A preferred method is
the Merrifield process (Merrifield, Recent Progress in Hormone Res., 23:451 (1967)).
The binding activity of these peptides may conveniently be tested using, for example,
the assays as described herein.

Once an isolated peptide of the invention is obtained, it may be purified by
standard methods including chromatography (e.g., ion exchange, affinity, and sizing
column chromatography), centrifugation, differential solubility, or by any other
standard technique for protein purification. For immunoaffinity chromatography, a
peptide may be isolated by binding it to an affinity column comprising antibodies that
were raised against that peptide, or a related peptide of the invention, and were affixed
to a stationary support. Alternatively, affinity tags such as hexa-His (Invitrogen),
Maltose binding domain (New England Biolabs, Inc.), influenza coat sequence
(Kolodziej et al., Methods Enzymol., 194:508-509 (1991)), and glutathione-S-
transferase can be attached to the peptides of the invention to allow easy purification
by passage over an appropriate affinity column. A DNA affinity column using DNA
containing a sequence encoding the peptides of the invention could be used in
purification.

Isolated peptides can also be physically characterized using such techniques as 
proteolysis, nuclear magnetic resonance, and x-ray crystallography. 

With regard to nucleic acid sequences of the present invention, "isolated"
means: an RNA or DNA polymer, portion of genomic nucleic acid, cDNA, or
synthetic nucleic acid which, by virtue of its origin or manipulation:

(i) is not associated with all of a nucleic acid with which it is associated in
nature (e.g. is present in a host cell as a portion of an expression vector); or

(ii) is linked to a nucleic acid or other chemical moiety other than that to
which it is linked in nature; or

(iii) does not occur in nature.

By "isolated" it is further meant a nucleic acid sequence:

(i) amplified in vitro by, for example, polymerase chain reaction (PCR);

(ii) synthesized by, for example, chemical synthesis;

(iii) recombinantly produced by cloning; or

(iv) purified, as by cleavage and gel separation.

The nucleic acid sequences of the present invention may be characterized,
isolated, synthesized and purified using no more than ordinary skill. See Sambrook et
al., Molecular Cloning, Cold Spring Harbor Press, New York, 1989, incorporated
herein by reference.

Due to the degeneracy of nucleotide coding sequences (see Alberts et al.,
Molecular Biology of the Cell, Garland Publishing, New York and London, 1989,
page 103, incorporated herein by reference), a number of different nucleic acid
sequences may be used in the practice of the present invention. These include, but are
not limited to, sequences encoding the peptides of Figure 3B and 3C. This includes
the substitution of different codons encoding the same amino acid residue within the
sequence, thus producing a silent change. Almost every amino acid except tryptophan
and methionine is represented by several codons. Often the base in the third position
of a codon is not significant, because those amino acids having 4 different codons
differ only in the third base. This feature, together with a tendency for similar amino
acids to be represented by related codons, increases the probability that a single,
random base change will result in no amino acid substitution or in one involving an
amino acid of similar character.
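
The degeneracy described above can be verified directly from the standard genetic code. The following is a minimal sketch that builds the standard codon table (DNA alphabet, "*" for stop codons), groups codons by the amino acid they encode, and tests whether a codon substitution is silent. The helper names are illustrative only.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, listed with bases in the order T, C, A, G and the
# third codon position varying fastest; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(bases): amino_acid
    for bases, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_by_amino_acid() -> dict[str, list[str]]:
    """Group the 64 codons by the residue (or stop signal) they encode."""
    groups = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        groups[amino_acid].append(codon)
    return dict(groups)


def is_silent(codon_a: str, codon_b: str) -> bool:
    """True if swapping one codon for the other leaves the residue unchanged."""
    return CODON_TABLE[codon_a] == CODON_TABLE[codon_b]


if __name__ == "__main__":
    groups = codons_by_amino_acid()
    single = sorted(aa for aa, codons in groups.items() if len(codons) == 1)
    print("Residues with a single codon:", single)
    print("Valine codons:", sorted(groups["V"]))
    print("GTT -> GTC silent?", is_silent("GTT", "GTC"))
```

In this table only methionine (M) and tryptophan (W) have a single codon, and the four valine codons differ only in the third base, consistent with the passage.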

The nucleotide sequences of the invention can be altered by mutations such as
substitutions, additions or deletions that provide for functionally equivalent nucleic
acid sequence. In particular, a given nucleotide sequence can be mutated in vitro or in
vivo, to create variations in coding regions and/or to form new restriction
endonuclease sites or destroy preexisting ones and thereby to facilitate further in vitro
modification. Any technique for mutagenesis known in the art can be used including,
but not limited to, in vitro site-directed mutagenesis (Hutchinson et al., J. Biol. Chem.,
253:6551 (1978)), use of TAB® linkers (Pharmacia), PCR-directed
mutagenesis, and the like. The functional equivalence of such mutagenized
sequences, as compared with unmutagenized sequences, can be empirically
determined by comparisons of structural and/or functional characteristics.

The isolated nucleotide sequences of the invention may be cloned or
subcloned using any method known in the art (see, for example, Sambrook, J. et al.,
Molecular Cloning, Cold Spring Harbor Press, New York, 1989), the entire contents
of which are incorporated herein by reference. In particular, nucleotide sequences of
the invention may be cloned into any of a large variety of vectors. Possible vectors
include, but are not limited to, cosmids, plasmids or modified viruses, although the
vector system must be compatible with the host cell used. Viral vectors include, but
are not limited to, lambda, simian virus, bovine papillomavirus, Epstein-Barr virus,
and vaccinia virus. Viral vectors also include retroviral vectors, such as
Amphotropic Murine Retrovirus (see Miller et al., Biotechniques, 7:980-990 (1984),
incorporated herein by reference). Plasmids include, but are not limited to, pBR,
pUC, pGEM (Promega), and Bluescript® (Stratagene) plasmid
derivatives. Introduction into and expression in host cells is done, for example, by
transformation, transfection, infection, electroporation, etc.

Examples of DNA vectors for constructing random peptide libraries, methods
of making same, and useful related materials and methods have been disclosed in U.S.
Pat. Nos. 5,270,170 and 5,498,530, the disclosures of which are incorporated herein
by reference.

The peptides described herein can be used in pharmaceutical compositions to
alter the binding of the nNOS PDZ domain and the proteins with which this domain
interacts. The peptides preferably alter the interactions between the nNOS PDZ
domain and melatonin or non-NMDA type glutamate receptors. An exemplary
pharmaceutical composition is a therapeutically effective amount of one of the
disclosed peptides optionally included in a pharmaceutically-acceptable and
compatible carrier. The term "pharmaceutically-acceptable and compatible carrier" as
used herein, and described more fully below, refers to one or more compatible solid or
liquid fillers, diluents or encapsulating substances that are suitable for administration to
a human or other animal. In the present invention, the term "carrier" thus denotes an
organic or inorganic ingredient, natural or synthetic, with which the peptides of the
invention are combined to facilitate administration.

Peptides of the invention can be stabilized to decrease protease sensitivity
and/or increase in vivo half-life by methods known in the art. For instance, peptides
of the invention can be modified by the addition of an N- or C-terminal tail, modified by
the methylation or glyoxylation of the termini or by substitution or other modification
to the sequence to increase the peptide half-life, stability, and/or protease resistance.

In some embodiments, the peptides are conformationally restricted such as
those which are cyclized, circularized or otherwise restricted by peptide and/or non-
peptide bonds to limit conformational variation and/or to increase stability and/or
half-life of the peptides. In some embodiments, peptides are provided as linear
peptides.

In some embodiments, peptides of the present invention comprise one or more
D amino acids. As used herein, the term "D amino acid peptides" is meant to refer to
peptides according to the present invention which comprise at least one and preferably
a plurality of D amino acids. D amino acid peptides consist of 4-25 amino acids. D
amino acid peptides retain the biological activity of the peptides of the invention that
consist of L amino acids, i.e. D amino acid peptides inhibit the interaction of nNOS
and the proteins which bind to nNOS. In some embodiments, the use of D amino acid
peptides is desirable as they are less vulnerable to degradation and therefore have a
longer half-life. D amino acid peptides comprising mostly D amino acids or D
amino acid peptides that consist of only D amino acids may comprise amino acid
sequences in the reverse order of the amino acid sequences of the peptides.

The term "therapeutically-effective amount" is that amount of the present
pharmaceutical compositions which produces a desired result or exerts a desired
influence on the particular condition being treated. Various concentrations may be
used in preparing compositions incorporating the same ingredient to provide for
variations in the age of the patient to be treated, the severity of the condition, the
duration of the treatment and the mode of administration.

The term "compatible" as used herein, means that the components of the
pharmaceutical compositions are capable of being commingled with the peptides of
the present invention, and with each other, in a manner such that there is no
interaction that would substantially impair the desired pharmaceutical efficacy.

Dose of the pharmaceutical compositions of the invention will vary depending
on the subject and upon the particular route of administration used. By way of
example only, an overall dose range of from about 1 microgram to about 300
micrograms or 0.1 to 100 mg/kg/day is contemplated for human use. Pharmaceutical
compositions of the present invention can also be administered to a subject according
to a variety of other, well-characterized protocols. Desired time intervals for delivery
of multiple doses of a particular composition can be determined by one of ordinary
skill in the art employing no more than routine experimentation.

The peptides of the invention may also be administered per se (neat) or in the
form of a pharmaceutically acceptable salt. When used in medicine, the salts should
be pharmaceutically acceptable, but non-pharmaceutically acceptable salts may
conveniently be used to prepare pharmaceutically acceptable salts thereof and are not
excluded from the scope of this invention. Such pharmaceutically acceptable salts
include, but are not limited to, those prepared from the following acids: hydrochloric,
hydrobromic, sulphuric, nitric, phosphoric, maleic, acetic, salicylic, p-toluene-
sulfonic, tartaric, citric, methanesulphonic, formic, malonic, succinic, naphthalene-2-
sulfonic, and benzenesulphonic. Also, pharmaceutically acceptable salts can be
prepared as alkaline metal or alkaline earth salts, such as sodium, potassium or
calcium salts of the carboxylic acid group.

The compositions include those suitable for oral, rectal, topical, nasal,
ophthalmic or parenteral administration, all of which may be used as routes of
administration using the materials of the present invention. Other suitable routes of
administration include intrathecal administration directly into spinal fluid (CSF),
direct injection onto an arterial surface and intraparenchymal injection directly into
targeted areas of an organ. Compositions suitable for parenteral administration are
preferred. The term "parenteral" includes subcutaneous injections, intravenous,
intramuscular, intrasternal injection or infusion techniques.

The compositions may conveniently be presented in unit dosage form and may 
be prepared by any of the methods well known in the art of pharmacy. All methods 
include the step of bringing the active ingredients of the invention into association 
with a carrier which constitutes one or more accessory ingredients. 

Compositions of the present invention suitable for oral administration may be
presented as discrete units such as capsules, cachets, tablets or lozenges, each
containing a predetermined amount of the peptides of the invention or as a suspension
in an aqueous liquor or non-aqueous liquid such as a syrup, an elixir, or an emulsion.

Preferred compositions suitable for parenteral administration conveniently
comprise a sterile aqueous preparation of peptides of the invention which is preferably
isotonic with the blood of the recipient. This aqueous preparation may be formulated
according to known methods using those suitable dispersing or wetting agents and
suspending agents. The sterile injectable preparation may also be a sterile injectable
solution or suspension in a non-toxic parenterally-acceptable diluent or solvent, for
example as a solution in 1,3-butane diol. Among the acceptable vehicles and solvents
that may be employed are water, Ringer's solution and isotonic sodium chloride
solution. In addition, sterile, fixed oils are conventionally employed as a solvent or
suspending medium. For this purpose any bland fixed oil may be employed including
synthetic mono- or diglycerides. In addition, fatty acids such as oleic acid find use in
the preparation of injectables.

The following non-limiting examples are illustrative of the invention. 
General Comments 

The following laboratory procedures were used in the examples below.

1. Fusion Protein Expression and Purification

GST-fusion proteins were expressed in either DH5α or BL21 bacterial strains.
Cultures with an OD600 of 0.2 were induced for three hours with isopropyl
β-D-thiogalactopyranoside (IPTG). Bacteria were harvested by centrifugation and
resuspended in 10 mL of NETN buffer which contains 20 mM
tris(hydroxymethyl)aminomethane (Tris), pH 8.0, 100 mM NaCl, 1 mM
ethylenediamine tetraacetic acid (EDTA), 0.5% NP-40, and 2 mM
phenylmethylsulfonyl fluoride (PMSF). The bacterial cells were lysed by sonication.
Affinity purification using glutathione-sepharose beads was carried out according to
protocols provided by the manufacturer (Pharmacia Biotech Inc., Uppsala, Sweden).

Fusion proteins can also be prepared using other fusion protein systems known
in the art including those set forth in U.S. Patents 5,270,170 and 5,498,530, both of
which are herein incorporated by reference.

2. Library Construction 

The random 15-mer library was constructed as described in detail by P. Schatz
et al., Meth. Enzymol., 267:171-191 (1996), which is herein incorporated by reference,
using an oligonucleotide with a degenerate region of 15 codons in the form of NNK,
where N denotes an equimolar mix of all four bases and K denotes a mix of G or T.
The library consisted of 1.3 x 10^10 independent recombinants. The amplified library
was stored at -80°C in HEK buffer containing 35 mM HEPES pH 7.5, 0.1 mM
EDTA, and 50 mM KCl.
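
The properties of an NNK degenerate codon can be worked out by enumeration. The following is a minimal sketch, assuming the standard genetic code and equal representation of every NNK codon at each position; it lists the NNK codons, identifies which residues and stop codons they encode, and estimates the fraction of 15-codon inserts that contain no stop codon. The names are illustrative and the calculation is not taken from the cited reference.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G with the third position
# varying fastest; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(bases): amino_acid
    for bases, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# N = any of the four bases; K = G or T.
NNK_CODONS = [n1 + n2 + k for n1 in "ACGT" for n2 in "ACGT" for k in "GT"]
STOP_CODONS = [codon for codon in NNK_CODONS if CODON_TABLE[codon] == "*"]
ENCODED_AMINO_ACIDS = {CODON_TABLE[codon] for codon in NNK_CODONS} - {"*"}


def fraction_without_stop(codon_count: int) -> float:
    """Fraction of random NNK inserts of the given length with no stop codon."""
    sense_fraction = 1 - len(STOP_CODONS) / len(NNK_CODONS)
    return sense_fraction ** codon_count


if __name__ == "__main__":
    print("NNK codons:", len(NNK_CODONS))
    print("Stop codons within NNK:", STOP_CODONS)
    print("Amino acids encoded:", len(ENCODED_AMINO_ACIDS))
    print("Fraction of 15-mers without a stop:", fraction_without_stop(15))
```

Restricting the third base to G or T keeps all twenty amino acids available while reducing the three stop codons of the full code to one.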

Random peptide libraries may also be constructed using other DNA binding
protein/specific binding site systems such as those disclosed in U.S. Patent Nos.
5,498,530 and 5,270,170, each of which is herein incorporated by reference.

3. Construction of maltose binding protein fusions

Nucleotide sequences encoding appropriate peptides were cloned into pELM3
(P. Schatz et al., Meth. Enzymol., 267:171-191 (1996)). This allows expression of the
corresponding maltose binding protein/peptide fusion. The procedure for expression
of maltose binding proteins was identical to that for GST fusions except that the LB
medium was supplemented with 2% glucose.
4. Affinity Panning 

A 2 ml aliquot of thawed bacterial cells in HEK was added to 6 ml of lysis
buffer (25 mM HEPES pH 7.5, 0.07 mM EDTA, 8.3% glycerol, 1.25 mg/ml bovine
serum albumin (BSA), 0.83 mM DTT, 0.2 mM PMSF). The bacteria were lysed for 2
to 4 min on ice by the addition of 0.15 ml of 10 mg/ml lysozyme (Boehringer
Mannheim, Indianapolis, IN) and then 2 ml of 20% lactose and 0.25 ml of 2 M KCl
were added. The supernatant was obtained after a 15 min centrifugation at 27,000 x
g. To initiate panning, 12 wells of a 96-well plate were first coated with GST-fusion
proteins (10 µg protein per well) at 4°C for 1 hour. The wells were then blocked with
1% BSA in phosphate-saline buffer (PBS) at pH 7.4. After precoating, 250 µl of the
supernatant was added to each of the precoated wells. After gentle agitation for 1 hour at
4°C, the unbound material was recovered and the wells were then washed with a
series of solutions: 5 times with HEK buffer supplemented with 0.2 M lactose and 1%
BSA, twice with HEK supplemented with 0.2 M lactose, and twice with HEK at 4°C.
The bound plasmids were eluted with 35 mM HEPES, pH 7.5, 0.1 mM EDTA, 200
mM KCl, 1 mM IPTG for 30 min at room temperature. The eluted DNA was
precipitated with isopropanol and amplified by electrotransformation. This pool of
bacterial transformants was used in subsequent rounds of panning.

The panning procedure was monitored by two parameters: recovery and
enrichment. Recovery was calculated by subtracting the number of plasmids bound to
BSA-coated wells from the number of plasmids bound to receptor/BSA-coated wells. The
enrichment at each round of panning was the ratio of recovered plasmids from
receptor-coated wells to those recovered from BSA-coated wells. The details of one
affinity panning using PDZ3 of PSD-95 are shown:



Round No.   Input         Output         Recovery       Enrichment
1           6.0 x 10^9    1.72 x 10^5    2.9 x 10^-5
2           3.2 x 10^9    1.4 x 10^5     4.4 x 10^-5
3           1.2 x 10^8    1.1 x 10^?     5.9 x 10^-3    270
4           8.4 x 10^7    4.8 x 10^?     5.7 x 10^?     1,700
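
The two monitoring parameters can be illustrated with a short Python sketch. It follows the wording of the text (recovery as a background-subtracted count, enrichment as a receptor-to-BSA ratio). The function names and the plasmid counts are hypothetical and are not taken from the panning experiment.

```python
def recovery(receptor_bound: float, bsa_bound: float) -> float:
    """Plasmids bound to receptor/BSA wells minus those bound to BSA-only wells."""
    return receptor_bound - bsa_bound


def enrichment(receptor_bound: float, bsa_bound: float) -> float:
    """Ratio of plasmids recovered from receptor-coated wells to BSA-coated wells."""
    if bsa_bound <= 0:
        raise ValueError("BSA-well count must be positive to form a ratio")
    return receptor_bound / bsa_bound


# Hypothetical counts for one round of panning (illustrative only).
receptor_wells = 1_700_000  # plasmids recovered from receptor/BSA-coated wells
bsa_wells = 1_000           # plasmids recovered from BSA-only wells

print(recovery(receptor_wells, bsa_wells))
print(enrichment(receptor_wells, bsa_wells))
```

A ratio is used for enrichment because it stays comparable between rounds even when the total number of input plasmids changes.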
5. ELISA

After three to four rounds of affinity panning, individual colonies were
randomly selected. Overnight cultures from single colonies were diluted 1:10 in 3 ml
of LB containing ampicillin (100 µg/ml) and grown for 1 hour at 37°C. The expression
of the LacI-peptide fusions was induced by the addition of arabinose to 0.2% for 3
hours. After induction, the cells were pelleted by centrifugation and lysed as
described above in 1 ml of lysis buffer plus lysozyme. The clarified lysates were used
immediately for ELISA or stored at -70°C. To prepare the ELISA, 96-well plates were
first coated with GST-fusion proteins (0.2 µg protein per well) of the nNOS, PSD-95,
or disheveled PDZ domain at 4°C for 1 hour. The wells were then blocked with 1% BSA
in phosphate-buffered saline (PBS) at pH 7.4. After precoating, the wells were washed
three times with PBS supplemented with 0.05% Tween-20 (PBT). To initiate the binding,
100 µl of 1:10 diluted lysate was added to each well. After 30 minutes at 4°C, the
plate was washed four times with PBT. The binding of LacI-peptide was detected
using rabbit anti-LacI antibody. After 4 washes with PBT, the plate was developed by
adding alkaline phosphatase-conjugated goat anti-rabbit antibody (GIBCO-BRL,
Gaithersburg, MD) in PBS/0.1% BSA (100 µl per well for 1 hour at 25°C) followed
by a 6 min treatment with p-nitrophenyl phosphate (4 mg/ml) in 1 M diethanolamine
hydrochloride, pH 9.8/0.24 mM MgCl2 (200 µl per well). Binding was quantified by
monitoring optical density (O.D.) at 405 nm on an E-max plate reader (Molecular
Devices Inc., Menlo Park, CA). The negative controls were wells coated with control
GST fusion or as otherwise indicated. All experiments were repeated at least once
with similar results.

ELISAs for maltose binding fusion proteins were performed as described
above with a few modifications. 100 µl of a 1:50 dilution of crude lysate was added
to each well. All buffers were the same but were supplemented with 1 mM maltose to
minimize oligomerization of maltose binding protein fusions (G. Richarme,
Biochemical and Biophysical Research Communications, 105:476-481 (1982)).
Interaction of maltose binding protein fusion proteins with immobilized GST-fusion
proteins was monitored by rabbit anti-maltose binding protein antibody (1:10,000
dilution, New England Biolabs, Inc., Beverly, MA).
6. Peptide-PDZ binding

To determine the affinity of peptide-PDZ interactions, monomeric maltose
binding protein fusions of peptides were purified on amylose affinity columns
according to a protocol provided by the manufacturer (New England Biolabs, Inc.,
Beverly, MA). Protein concentration was determined by the Bradford assay (BioRad,
Richmond, CA) using BSA as the standard. The effective concentration, i.e., EC50, was
determined by dose-dependent ELISA tests. GST fusion was bound at 0.05 µg per
well. The maltose binding protein fusions were serially diluted (1:5) starting at
15 µM and then incubated. The data were fit with the Hill equation
(O.D.405 = O.D.405Max / (1 + {EC50/[x]}^n)). A non-linear least squares algorithm was used.
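
As a rough illustration of how an EC50 could be estimated from such a dilution series, the following Python sketch fits the Hill equation above by a coarse grid search. This is an assumption-laden stand-in, not the non-linear least squares algorithm actually used; the function names, the grids, and the synthetic O.D. readings are all hypothetical.

```python
def hill(x, od_max, ec50, n):
    """Hill equation: O.D. = O.D.max / (1 + (EC50 / x)**n)."""
    return od_max / (1.0 + (ec50 / x) ** n)


def serial_dilution(start, factor, steps):
    """Concentrations for a 1:factor dilution series beginning at `start`."""
    return [start / factor ** i for i in range(steps)]


def fit_hill(concs, ods, ec50_grid, n_grid):
    """Least-squares fit by grid search over EC50 and n.

    For each (EC50, n) pair the best O.D.max has a closed form, because the
    model is linear in O.D.max. Returns (od_max, ec50, n).
    """
    best = None
    for ec50 in ec50_grid:
        for n in n_grid:
            basis = [1.0 / (1.0 + (ec50 / x) ** n) for x in concs]
            od_max = sum(b * y for b, y in zip(basis, ods)) / sum(b * b for b in basis)
            sse = sum((od_max * b - y) ** 2 for b, y in zip(basis, ods))
            if best is None or sse < best[0]:
                best = (sse, od_max, ec50, n)
    return best[1], best[2], best[3]


# Synthetic readings (hypothetical): 1:5 dilutions from 15 uM, expressed in nM.
concs_nM = serial_dilution(15000.0, 5.0, 8)
ods = [hill(x, od_max=2.0, ec50=40.0, n=1.0) for x in concs_nM]

ec50_grid = [10 ** (k / 20) for k in range(0, 61)]  # 1 nM to 1000 nM, log-spaced
n_grid = [0.5 + 0.1 * i for i in range(16)]         # 0.5 to 2.0
od_max, ec50, n = fit_hill(concs_nM, ods, ec50_grid, n_grid)
print(od_max, ec50, n)
```

The resolution of the estimate is limited by the grid spacing; a gradient-based least squares routine would refine it further.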

7. Yeast Two Hybrid Analysis 

Yeast Y187 cells were co-transformed with expression vectors encoding
various Gal4 DNA binding domain-nNOS fusions and the Gal4 activation domain
fused to PSD-93 (amino acids 116-421). Each transformation mixture was plated on
synthetic dextrose plates lacking tryptophan and leucine. Interaction was measured
by the liquid culture β-galactosidase assay as described (S. Fields and O. Song,
Nature, 340:245-246 (1989); Clontech, Palo Alto, CA). Values are
representative of duplicate experiments.

8. Fusion Protein Affinity Chromatography 

Rat whole brain was homogenized in 10 volumes (w/v) of Tris-HCl buffer, pH 7.4,
and centrifuged at 32,000 x g for 20 minutes. Membranes were solubilized for 2
hours at 4°C in buffer containing 200 mM NaCl and 1% Triton X-100, and insoluble
material was pelleted by centrifugation at 100,000 x g for 30 minutes. Extracts were
incubated with control amylose beads or amylose beads saturated with maltose-binding
fusion proteins as indicated. Samples were loaded into disposable columns,
which were washed with 50 volumes of buffer containing 1% Triton X-100 and 300
mM NaCl. Retained proteins were eluted with 150 µl of loading buffer and were
resolved by SDS-PAGE. Blots were probed with a monoclonal antibody to
nNOS (Transduction Labs, Lexington, KY).

Example 1 - Construction of a random C-terminal peptide library

Peptide binding and x-ray crystallographic studies of PSD-95 indicate that
specificity of the peptide-PDZ interaction is primarily determined by the final 4
residues of the peptide ligand (D. Doyle et al., Cell, 85:1067-1076 (1996); E. Kim et
al., Nature, 378:85-88 (1995); H. Kornau et al., Science, 269:1737-1740 (1995); B.
Muller et al., Neuron, 17:255-265 (1996); M. Niethammer et al., J. Neurosci.,
16:2157-2163 (1996)). To determine optimal peptide binding ligands for other PDZ
domains, we constructed a fusion protein library that contains 15 randomized residues
at the C-terminus. In this library, a degenerate oligonucleotide encoding the random
peptides is fused to the end of the E. coli lac repressor (M. Cull et al., Proc. Natl.
Acad. Sci. USA, 89:1865-1869 (1992)), which is herein incorporated by reference.
Following expression, the Lac repressor protein binds to the lac operator sequence on
the same plasmid, linking each randomized 15-mer peptide to the plasmid encoding
that peptide (Figure 1). This linkage allows repeated rounds of selection for specific
peptide ligands in the population by affinity purification of peptide-repressor-plasmid
complexes (see the experimental procedures set forth above).

In vitro selection of optimal binding peptides for PDZ domains 
A random 15-mer peptide library was screened against the third PDZ domain (PDZ3)
of PSD-95 according to the following steps. Step I. A pool of oligonucleotides
encoding 15 random amino acids (X15) was cloned in frame C-terminal to lacI.
Protein expression from each plasmid of the library yields a LacI fusion with a
distinct peptide sequence. The recombinant LacI binds the lacO sites present on the
same plasmid, yielding LacI-plasmid complexes that are purified from the E. coli.
Step II. Affinity panning selects peptides that interact with the target receptor, e.g.,
a PDZ domain. Step III. The bound plasmid DNA can be specifically recovered by
addition of IPTG. Step IV. The recovered plasmids are retransformed, amplified, and
used for subsequent rounds of panning.
In PSD-95, the PDZ1 and PDZ2 domains interact with the C-terminal four amino
acids found in Shaker potassium channels and NMDA receptor subunits (H. Kornau
et al., Science, 269:1737-1740 (1995); E. Kim et al., Nature, 378:85-88 (1995)),
which have a shared consensus of E-(T/S)-X-V-COOH. PDZ3 binds to an identical
sequence (D. Doyle et al., Cell, 85:1067-1076 (1996)). A PDZ3 fusion protein was
constructed by linking amino acids 302-402 of PSD-95 to the C-terminus of
glutathione S-transferase (GST). The purified protein was incubated with a 15-mer
lacI library with a complexity of 1.3 x 10^10. After 4 rounds of panning selection, a
1,700-fold enrichment of interacting peptides was achieved (see Experimental
procedures). At this stage, individual clones were randomly selected and subjected to
ELISA analysis (Figure 2A).

Briefly, crude bacterial lysates from individual clones (horizontal axis of
Figure 2A) selected through four rounds of panning were prepared (see Experimental
procedures). Association of LacI-peptide fusion with GST-PDZ3 was determined by
ELISA. Dashed bars indicate wells coated with BSA only; gray bars: GST-NABHERG
+ BSA; open bars: GST-nNOS-PDZ + BSA; closed bars: GST-PDZ3 + BSA. GST-NABHERG
is a fusion protein containing amino acids 1-135 from the HERG potassium
channel, which has no homology with PDZ domains (X. Li et al., J. Biol. Chem.,
272(2):705-708 (1997)). All ELISA experiments in this figure and subsequent figures
have been repeated at least once with similar results.

Enriched clones were divided into two classes. One class, such as PD-301,
PD-302, and PD-304, interacted with both the GST control and the GST-PDZ3 fusion
(Figure 2A), suggesting that the corresponding peptides interact with GST. The other
class of clones, including PD-312, PD-314, and PD-315, bound selectively to GST-PDZ3.
Affinity of interaction (EC50) was 2 to 100 nM as determined by quantitative
ELISA as set forth above.

To determine the binding specificity, purified recombinant PDZ fusion
proteins of nNOS (amino acids 1-150; D. Bredt et al., Nature, 351:714-718 (1991))
and disheveled (amino acids 146-226; J. Klingensmith et al., Genes Dev., 8:118-130
(1994)) were also tested for peptide binding. Under the same conditions, the
PDZ3-positive clones failed to interact with the PDZ domain of nNOS (Figure 2A) or
with the PDZ domain of disheveled. Plasmids encoding PDZ3-specific clones were
sequenced.

An alignment of the deduced amino acid sequences is shown (Figure 2B).
Indeed, most of the interacting peptides closely resemble the peptide sequence at the
C-terminus of Shaker-like potassium channels and NMDA receptor subunits, with a
consensus of E-(T/S)-X-V-COOH.

Identification of novel peptides interacting with PDZ domain of nNOS 

To determine optimal peptide ligands for the nNOS PDZ domain, a
recombinant GST fusion protein corresponding to the coding sequence of amino acids
1 to 150 of nNOS (nNOS-PDZ) was used for peptide selection. After four rounds of
panning, a 2,300-fold enrichment was achieved. Individual GST-nNOS-PDZ
specific clones were identified by ELISA (Figure 3A). It was discovered that 95 out
of 150 clones specifically interacted with nNOS-PDZ but not with the control GST
fusion protein. Binding affinity of these peptides to immobilized nNOS-PDZ (EC50)
was 8 to 100 nM. Plasmids from these nNOS-specific clones were sequenced. The
deduced amino acid sequences of 95 independent clones were aligned via their
C-termini (Figures 3B and 3C).

An analysis of amino acid abundance at each position indicates that valine
again is strongly preferred (89%) at the 0 position (Figures 4A-4I). At the -1 position,
there is no obvious preference: fifteen of the twenty amino acids were found, and only
D, E, H, K and N were absent. In contrast to the PDZ3 consensus, aspartate
at the -2 position was present in 81% of all nNOS-PDZ binding peptides. At the -3
position, glycine is significantly preferred. Because glycine was used as
part of the linker that separates LacI from the random peptide (Figure 1), this bias
was corrected for; the corrected glycine abundance is 47% at the -3
position. From position -4 to position -8, no obvious amino acid preference was
observed (Figures 4A-4I). Based on the amino acid abundance at each position, the
optimal sequence for an nNOS binding peptide (NBP) is g-D-X-V-COOH.
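
The positional analysis described above can be sketched in a few lines of Python: peptides are aligned at their C-termini and residue frequencies are tallied at positions 0, -1, -2 and so on. The function name is hypothetical, the five peptides are only a small illustrative subset of the clones listed in Figure 3B, and the glycine linker correction is not reproduced here because the text does not describe how it was computed.

```python
from collections import Counter


def positional_abundance(peptides, depth=4):
    """Residue frequencies at C-terminal positions 0, -1, ..., -(depth - 1).

    Index i of the returned list holds a dict mapping residue to its fraction
    at position -i, counted over the peptides long enough to reach it.
    """
    counts = [Counter() for _ in range(depth)]
    for pep in peptides:
        for offset in range(min(depth, len(pep))):
            counts[offset][pep[-1 - offset]] += 1
    result = []
    for counter in counts:
        total = sum(counter.values())
        result.append({aa: c / total for aa, c in counter.items()} if total else {})
    return result


# Illustrative subset of nNOS binding peptides (linker GGG included).
peptides = ["GGGVKYFGDSV", "GGGVLGDLV", "GGGDAV", "GGGWQGDPV", "GGGGDLV"]
for offset, freqs in enumerate(positional_abundance(peptides)):
    print(-offset, freqs)
```

In this small subset the short clones place linker glycines at the -3 position, which is the kind of bias the text says had to be corrected.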
SPECIFICITY OF NBP BINDING TO NNOS-PDZ 

Figures 5A-5D show that NBPs bind specifically to the nNOS PDZ domain and to
native nNOS protein from rat brain.

The in vitro peptide selection suggests that PDZ3 of PSD-95 and the nNOS-PDZ,
despite a shared preference for valine at the 0 position, have distinct binding
specificities. To test this directly, we performed ELISA as set forth above and found
that 36 randomly chosen NBPs failed to bind to PDZ3 of PSD-95 (Figure 5A) or to
the PDZ domain of disheveled. Based on the peptide-PDZ3 crystal structure (D.
Doyle et al., Cell, 85:1067-1076 (1996)), it is known that the side chain of His372 of
PSD-95 forms a critical sequence-specific hydrogen bond with the T at the -2 position
of the bound peptide. Interestingly, the amino acid at the corresponding position of
nNOS-PDZ is Y77, consistent with the idea that substitution of H to Y at this position
converts the -2 position peptide preference from T to D. Also in agreement with this
notion, the corresponding residue of the disheveled PDZ is N. Amino acid sequence
comparison of a number of PDZ domains present in GenBank shows that the residue
after the H or Y is also conserved (nNOS is Y-D, PDZ3 is H-E). To determine
whether the Y77 of nNOS is critical, we mutated Y77D78 to H77E78. This mutant,
nNOS-PDZHE, lost its ability to bind D-X-V peptides and gained the ability to bind
T-X-V peptides (Figure 5B).

To evaluate the specificity of the NBP-nNOS interactions, we mutated the D
at the -2 position of the NBP-123 (LDRLRNRVHGDAV-COOH, EC50 = 40 nM)
peptide to A, L, Q, R, S, T, and V. Peptides with these amino acid substitutions failed
to interact with nNOS-PDZ (Figure 5C). To test whether NBPs bind to native nNOS
protein, we generated an affinity column linking NBP-123 to an agarose matrix (see
the experimental procedures set forth above). We found that nNOS protein in crude
rat brain homogenates adhered to the NBP-123 matrix. In contrast, nNOS did not
bind to an analogous column in which the -2 D residue of NBP-123 was mutated to T
(Figure 5D).

The nNOS-PDZ Domain Has Unique Structural Features

Previous studies have shown that the N-terminal domain of nNOS (amino
acids 1-150) binds to the PDZ domain of α1-syntrophin and to the second PDZ
domains of PSD-95 and PSD-93 (J. Brenman et al., Cell, 84:757-767 (1996)).
Although amino acids 16 to 100 of nNOS define the consensus PDZ domain, binding
studies have shown that fusions containing amino acids 1 to 100 of nNOS do not bind
to the PDZ domain of either α1-syntrophin or PSD-93 (J. Brenman et al., Cell,
84:757-767 (1996)). To test whether the peptide binding property of the nNOS-PDZ
is confined to the typical consensus, we tested whether any of five randomly selected
NBPs interact with a fusion protein containing nNOS 1-100. We found that all 5
NBPs bind to nNOS (1-150) but not to nNOS (1-100).

To determine the minimal functional structure for nNOS-PDZ to bind NBPs
and PSD-93, we generated a panel of six fusion proteins that express various regions
of the N-terminus of nNOS (Figure 6). We first evaluated binding of these constructs
to the PDZ repeats in PSD-93 using the yeast two-hybrid analysis. Binding to PSD-93
required amino acids 16-130 of nNOS; truncations on either side of this core
nNOS-PDZ eliminate the interaction. Similarly, all NBPs required amino acids 16-130
for binding as tested by ELISA (Figure 6). These studies indicate that the
functional nNOS-PDZ requires additional amino acids beyond the conserved
consensus and indicate that both peptide-PDZ and PDZ-PDZ interactions of nNOS
likely require a similar tertiary structure.
Candidate proteins that interact with nNOS

Identification of the ligand binding consensus of nNOS-PDZ allows an
electronic search for potential nNOS interacting proteins present in the protein
databases. A pre-release version of the XREFPatScan software, written in the Perl
programming language, was used to find all occurrences of the D-X-V pattern at the
carboxy-terminus of protein sequences in the non-redundant protein database (nr, 11
Nov 1996) maintained at the National Center for Biotechnology Information
(http://www.ncbi.nlm.nih.gov). This sequence pattern scan revealed 484 matches
in the database. Interestingly, this list of potential binding partners includes both
glutamate and melatonin receptors, which are well known to influence nNOS activity.
See Figures 8A-8R for more detailed results of the PDZ scan of the database.

Another suitable software package is the SASP package available from GCG 
(Genetics Computer Group, University Research Park, Madison WI). 
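
A C-terminal pattern scan of this kind can be approximated with a short Python sketch. It is not the XREFPatScan or GCG software named above; it is a minimal stand-in that assumes FASTA-formatted input, and the function names and the three example records are hypothetical.

```python
import re

# D, then any residue, then V, anchored at the carboxy-terminus.
DXV_CTERM = re.compile(r"D[A-Z]V$")


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def scan_cterminal(fasta_text, pattern=DXV_CTERM):
    """Return headers of sequences whose C-terminus matches the pattern."""
    hits = []
    for header, seq in parse_fasta(fasta_text):
        # Normalize case and drop a trailing stop marker, if present.
        if pattern.search(seq.upper().rstrip("*")):
            hits.append(header)
    return hits


# Hypothetical records, for illustration only.
example = """>hypA
MKTAYIAKQR
DAV
>hypB
MKTETSV
>hypC
MSDLVAA
"""
print(scan_cterminal(example))
```

Anchoring the expression with `$` restricts matches to the free carboxy-terminus, which is the part of the ligand the PDZ domain recognizes according to the text.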

In summation, we have employed a powerful genetic strategy to identify C-terminal
peptide ligands for the nNOS PDZ domain. This strategy takes advantage of
the strong protein-DNA association between the lac repressor and the lac operator
sequence. This interaction is used to obtain a highly complex library of expressed
peptides, each bound to the plasmid that encodes it. By simply panning for peptide
binding and then sequencing the corresponding plasmids, we were able to rapidly
determine optimal binding partners for the nNOS-PDZ. Identified peptides bind
potently to nNOS with binding affinities (EC50) in the 8-100 nM range, similar to the
affinity between the NMDA receptor and PDZ domain of PSD-95 (B. Muller et al.,
Neuron, 17:255-265 (1996)). These peptide sequences are likely to be
physiologically relevant because a similar panning procedure yielded the known
peptide ligands for PDZ3 of PSD-95.

The consensus peptide binding sequence for the nNOS-PDZ is D-X-V, which
contrasts with the E-(T/S)-X-V found for PDZs of PSD-95 (D. Doyle et al., Cell,
85:1067-1076 (1996); E. Kim et al., Nature, 378:85-88 (1995); H. Kornau et al.,
Science, 269:1737-1740 (1995); B. Muller et al., Neuron, 17:255-265 (1996); M.
Niethammer et al., J. Neurosci., 16:2157-2163 (1996)). Analysis of the crystal
structure of peptide-bound PDZ3 suggests rational explanations for these alternate
specificities (D. Doyle et al., Cell, 85:1067-1076 (1996)). Similar preference of the two
domains for terminal valine is expected because the critical residues in the
carboxylate binding loop of PDZ3, including the GLGF tetrapeptide, are precisely
conserved in nNOS-PDZ. While the carboxylate loop of PSD-95 binds most potently
to peptides with C-terminal valine, other terminal hydrophobic amino acids are
permitted. Such degeneracy was also found in some nNOS binding peptides, e.g.,
NBP-14 (Figures 3B and 3C). Inwardly rectifying potassium channel subunits of
class 2.0 terminate with S-X-I and these channels also bind to PSD-95. In addition,
the -2 serine of Kir 2.3 serves as a potent substrate for protein kinase A and this
phosphorylation event regulates binding of the channel to PSD-95 (N. Cohen et al.,
Neuron, 17:759-767 (1996)).

Specificity of PDZ3 for T/S at the peptide -2 position is mediated by hydrogen
bonding of the hydroxyl of the T/S with the N-3 nitrogen of H372 of PDZ3 (D. Doyle
et al., Cell, 85:1067-1076 (1996)). The corresponding residue in nNOS is Y77. The
greater electrophilic character of Y compared to H may explain the preference of the
nNOS PDZ for the acidic amino acid D at peptide position -2. Accordingly, mutation
of Y77D78 of nNOS to H77E78 changes the binding specificity from DXV to TXV.
Interestingly, the Y77 position is not generally conserved in other orphan PDZ domains,
and this single residue may allow for much of the diverse peptide ligand specificity at
the -2 position.

These studies emphasize that the nNOS PDZ domain has unique structural
features. The consensus PDZ domain contains 80 amino acids, and PDZ3 of PSD-95
was functionally active as a 101 amino acid polypeptide (D. Doyle et al., Cell,
85:1067-1076 (1996)). By contrast, a functional nNOS PDZ domain requires an
additional 30 amino acids C-terminal to the identified consensus. We wondered
whether the smaller nNOS constructs, such as nNOS 1-100, were inactive due to a
non-specific problem with polypeptide folding. However, circular dichroism (CD)
analysis indicated a predicted high degree of secondary structure for nNOS 1-100
consisting of ~X% α-helix and ~Y% β-strand. This is similar to the composition
of α-helix and β-strand found in the PDZ3 structure of PSD-95. Furthermore, nNOS
1-100 showed thermal stability to 42°C, which is comparable to the thermal stability
of a functionally active PDZ domain of FAP. Therefore, we believe that the functional
nNOS PDZ has a structure somewhat larger than that of other PDZ domains. By
using our genetic peptide selection strategy, it will be possible to determine whether
other PDZ domains are also larger than the presently identified consensus. See K.
Christopherson et al., J. Clin. Invest., 100:2424-2429 (1997); and N. Stricker et al.,
Nat. Biotechnol., 15:336-342 (1997), the disclosures of which are hereby incorporated
by reference.

In addition to interacting with peptide ligands, the PDZ domain of nNOS
associates with other PDZ domains, including the PDZ domain of α1-syntrophin and
the second PDZ of PSD-95 and PSD-93. The three-dimensional structure of a PDZ-PDZ
heterodimer is not yet available, but our data suggest the PDZ-PDZ binding interface
overlaps with the peptide recognition sequences. Thus, deletions of nNOS PDZ that
abolish peptide binding also eliminate binding to α1-syntrophin and PSD-93.
Crystallography of PDZ3 of dlg showed that the PDZ domain forms a dimer in which
the surface of the peptide-binding domain of one PDZ subunit interacts with residues
in β-strands from the other subunit (J. Cabral et al., Nature, 382:649-652 (1996)).
This binding topology of PDZ domains may explain why the SXV peptide of the
NMDA receptor 2B potently blocks nNOS binding to PSD-95 (J. Brenman et al.,
Cell, 84:757-767 (1996)). Proteins containing the DXV nNOS interacting domain
may also disrupt interaction of nNOS with PDZ proteins. This may explain the
paradoxical situation that α1-syntrophin, but not nNOS, is present at the sarcolemma in
patients with Becker muscular dystrophy (D. Chao et al., Journal of Experimental
Medicine, 184:609-618 (1996)). Perhaps, in the myofibers of these patients, the
nNOS PDZ is occupied by a protein with a C-terminal D-X-V and is unable to bind to
α1-syntrophin.

The disclosed genetic selection strategy will help identify peptide ligands for
the hundreds of orphan PDZ domains that have been sequenced. After isolating high
affinity peptides, protein database analysis may suggest candidate physiological
binding partners. Our search with the terminal DXV consensus for nNOS yielded
several attractive candidates including melatonin receptor 1a (U14108) and an
alternatively spliced form of GluR6 (X66117). Though nNOS is best activated by
calcium influx through NMDA receptors (J. Garthwaite et al., Nature, 336:385-388
(1988)), there is also abundant literature showing that nNOS activity can be regulated
by melatonin (D. Vesely, Mol. Cell. Biochem., 35:55-58 (1981)) and by non-NMDA
type glutamate receptors (J. Garthwaite et al., Annu. Rev. Physiol., 57:683-706
(1995)). Our data suggest that physical association of nNOS with GluR6 and with
melatonin receptors may participate in this functional coupling.
The invention has been described with reference to preferred embodiments
thereof. However, it will be appreciated that those skilled in the art, upon
consideration of this disclosure, may make modifications and improvements within
the spirit and scope of the invention as set forth in the following claims.
What is claimed is:

1. A peptide of at least 3 amino acids comprising the sequence D-X-V-COOH,
wherein D=aspartic acid, X=any amino acid and V=valine.

2. An isolated nucleic acid encoding the peptide of claim 1. 

3. A method for determining the identity of proteins which interact with a 
protein binding domain (orphan protein domain) of a first protein (Protein Interaction 
Network (PIN)) comprising: 

screening a random peptide library comprising transformed host cells, each of
which contains a plasmid that comprises a lacO binding site and encodes a fusion 
protein comprising a Lac repressor DNA binding protein fused to a peptide, wherein 
each transformed host cell differs from one another with respect to the peptide in said 
fusion protein, said screening comprising lysing the host cells under conditions that 
the fusion protein remains bound to the plasmid at the lacO binding site, contacting 
the fusion proteins of the random peptide library with a protein binding domain 
(orphan protein domain) under conditions conducive to specific peptide-protein 
binding domain (orphan protein domain) binding; 

isolating the plasmid that encodes a peptide that binds to the protein binding 
domain (orphan protein domain); 

sequencing the plasmid to obtain the sequence of the peptide that binds to the 
protein binding domain (orphan protein domain); and 

searching the available nucleic acid and protein sequence databases to identify
proteins which comprise the sequence of the peptide which binds to the protein
binding domain (orphan protein domain).

4. The method of claim 3, further comprising the step of: assembling the
PINs from different orphan protein domains into an electronic databank that can be
searched with the sequence of a protein domain (orphan protein domain) of interest.

5. A method of treating a neurodegenerative disease, motility disorder or 
muscular dystrophy in a human or animal comprising administering to a patient in 
need thereof an effective amount of the peptide of claim 1. 

6. The peptide of claim 1, wherein said peptide comprises at least 5
amino acids.

7. The peptide of claim 1, wherein said peptide comprises at least 10
amino acids.

8. The peptide of claim 1, wherein said peptide comprises at least 15
amino acids.

9. A peptide ligand detection system comprising: 

a) a random peptide library comprising a recombinant DNA vector 
encoding a DNA binding protein that specifically binds a DNA sequence on the 
vector, the DNA binding protein comprising a covalently linked sequence encoding a 
random peptide sufficient for the vector to encode at least about 10^6 different fusion
proteins each of which is capable of specifically binding the DNA sequence on the 
vector; and 

b) an orphan protein domain sequence immobilized on a solid support 
capable of specifically binding the random peptide of the DNA binding protein. 

10. The peptide ligand detection system of claim 9 further comprising an 
inducer molecule capable of specifically binding the DNA binding protein sufficient 
to release the recombinant DNA vector from the immobilized orphan protein domain 
sequence. 

11. The peptide ligand detection system of claim 9 wherein the DNA
binding protein comprises a prokaryotic repressor protein sequence and the DNA
sequence bound by the DNA binding protein is a prokaryotic operator sequence.

12. The peptide ligand detection system of claim 11 wherein the
prokaryotic repressor protein sequence is a lac repressor or a fragment thereof capable
of specifically binding the DNA sequence on the vector.

13. The peptide ligand detection system of claim 11 wherein the
prokaryotic operator sequence is lac O or a fragment thereof capable of being
specifically bound by the prokaryotic repressor protein sequence.

14. The peptide ligand detection system of claim 10 wherein the inducer
molecule is isopropylthio-β-D-galactoside (IPTG).

15. The peptide ligand detection system of claim 11 wherein the
prokaryotic repressor protein sequence and the random peptide sequence are spaced
by a peptide linker sequence encoded by nucleic acid sequence comprising -G-G-G-.

16. A peptide ligand detected by the ligand detection system of claim 1
having a binding affinity (EC50) for the orphan protein domain of between about 0.5
to 500 nM.

17. A peptide ligand comprising between about 3 and 50 amino acids
comprising an amino acid sequence consisting of D-X-V-COOH, wherein the peptide
ligand has a binding affinity (EC50) for an orphan protein domain of between about
0.5 to 500 nM.

18. The peptide ligand of claim 17, wherein the orphan protein domain is a 
PDZ domain. 

19. The peptide ligand of claim 18, wherein the PDZ domain is obtained
from a protein selected from the group consisting of nitric oxide synthase (nNOS),
post-synaptic density protein (PSD-95/SAP-90), post-synaptic density protein (PSD-93),
epithelial tight-junction protein zona occludens-1 (ZO-1), N-methyl-D-aspartate
(NMDA) type glutamate receptor, Shaker-type potassium channel subunit, and
α1-syntrophin.

20. A therapeutic composition comprising the peptide ligand of claim 18. 

21. An isolated nucleic acid encoding the peptide ligand of claim 18. 

22. A DNA vector comprising the isolated nucleic acid of claim 21. 

23. A method of detecting a peptide ligand capable of specifically binding 
an orphan protein domain of a protein, the method comprising: 

a) lysing transformed cells comprising a random peptide library comprising a 
recombinant DNA vector encoding a DNA binding protein that specifically binds a 
DNA sequence on the vector, the DNA binding protein comprising a covalently 
linked sequence encoding a random peptide sufficient for the vector to encode at least 
10^6 different fusion proteins each of which is capable of specifically binding the DNA
sequence on the vector, wherein the lysing is under conditions such that the DNA 
binding protein comprising the random peptide remains bound to the recombinant 
DNA vector, 

b) contacting the fusion proteins of the random peptide library to an 
immobilized orphan protein domain under conditions conducive to specific peptide- 
orphan protein domain binding; and 

c) isolating a recombinant DNA vector encoding a fusion protein that 
specifically binds to the orphan protein domain. 

24. The method of claim 23 further comprising the steps of transforming a
host cell with the recombinant DNA vector obtained in step c), repeating steps a), b),
and c) with the host cell and isolating a selected recombinant DNA vector.

25. The method of claim 24 further comprising determining the amino acid
sequence of the random peptide encoded by the selected recombinant DNA vector.

26. The method of claim 25 further comprising searching a protein
sequence database to identify an orphan protein domain in the database comprising
the random peptide.

27. The method of claim 26 further comprising assembling a protein 
interaction network (PIN) sufficient to correlate a plurality of random peptide 
sequences to the orphan protein domain. 

28. The method of claim 27 further comprising assembling a super protein 
interaction network (SPINS) comprising a plurality of protein interaction networks 
(PINs) sufficient to serve as an electronic extension database for the protein sequence 
database. 

29. The method of claim 26 wherein the orphan protein domain in the 
database is any one of the orphan protein domains (protein modules) shown in Figure 
7. 

30. A method of detecting a peptide ligand capable of specifically binding 
an orphan protein domain of interest, the method comprising searching a super protein 
interaction network (SPINS) with an amino acid sequence comprising an orphan 
protein domain of interest, and identifying the peptide ligand capable of specifically 
binding the orphan protein domain of interest. 

31. The method of claim 30, wherein the peptide ligand is obtained from a 
random peptide library. 
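
The relationship among peptide ligands, protein interaction networks (PINs) and the super protein interaction network (SPINS) recited in claims 27 to 30 can be pictured with a small data-structure sketch. The class names, the placeholder domain sequence and the simple substring lookup are illustrative assumptions and are not taken from the specification; the two peptides are clones NBP-93 and NBP-94 of FIG. 3C, used only as sample entries.

```python
class ProteinInteractionNetwork:
    """Hypothetical PIN: one orphan domain and the peptides selected against it."""

    def __init__(self, domain_name, domain_sequence):
        self.domain_name = domain_name
        self.domain_sequence = domain_sequence
        self.peptides = []

    def add_peptide(self, peptide):
        # Store the peptide without the trailing '*' stop marker used in the figures.
        self.peptides.append(peptide.rstrip("*"))


class SuperProteinInteractionNetwork:
    """Hypothetical SPINS: a collection of PINs searchable by domain sequence."""

    def __init__(self):
        self.pins = []

    def add_pin(self, pin):
        self.pins.append(pin)

    def search(self, query_sequence):
        """Return {domain_name: peptides} for every PIN whose domain occurs in the query."""
        return {
            pin.domain_name: list(pin.peptides)
            for pin in self.pins
            if pin.domain_sequence in query_sequence
        }


# Placeholder domain sequence (NOT a real PDZ sequence), for illustration only.
pin = ProteinInteractionNetwork("example-domain", "ACDEFGHIK")
pin.add_peptide("GGGDRV*")  # NBP-93, FIG. 3C
pin.add_peptide("GGGDLV*")  # NBP-94, FIG. 3C

spins = SuperProteinInteractionNetwork()
spins.add_pin(pin)

hits = spins.search("MMACDEFGHIKLL")
print(hits)
```

A real system would presumably match domains by sequence similarity rather than by exact substring; the exact match here only keeps the sketch short.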
CLONE NO.    SEQUENCE 

Library      GGGXXXXXXXXXXXXXXX* 
PD-205       GGGMFVGDQIHDLRLETSV* 
PD-210       GGGMATSRPSGARRTTSV* 
PD-211       GGGMSGWFHDWLGRETTV* 
PD-212       GGGMFVGDQVDLRLETSV* 
PD-215       GGGILIVRNLETSV* 
PD-303       GGGRSLIGAVEKRQETSV* 
PD-307       GGGQETLRRLSVGPETSV* 
PD-312       GGGHRRSARYLESSV* 
PD-314       GGGREASNKVRLRKESTV* 
PD-315       GGGGPESLLWKVRRETSL* 
PD-325       GGGRIELHGVLKGCETAV* 

FIG. 2B 
CLONE NO.    SEQUENCE 

Library      GGGXXXXXXXXXXXXXXX* 
NBP-4        GGGGT?2KAVHXDWGYSV* 
NBP-5        GGGIEAGGDPV* 
NBP-7        GGGZPV* 
NBP-8        GGGDARTKIWNF_AALLI* 
NBP-9        GGGAQGRWPQFCVYPLAV* 
NBP-10       GGGVKYFGDSV* 
NBP-11       GGGVLGDLV* 
NBP-12       GGGAMEVTLLSHQFGDPV* 
NBP-14       GGGDAI* 
NBP-15       GGGWAGYGRGMAVSGDMV* 
NBP-17       GGGFPFFMGTMGEYGIQV* 
NBP-18       GGGLGKDYPSAPDNGDLV* 
NBP-24       GGGIYGMMRIGTGLVDVL* 
NBP-27       GGGAGQDKQAGQKWGDLV* 
NBP-28       GGGGVDWV* 
NBP-32       GGGDAV* 
NBP-33       GGGRWDWV* 
NBP-34       GGGKGHIAITSDGVGDLL* 
NBP-35       GGGNYDRVGLLRGPVDFL* 
NBP-36       GGGKRPDGVLFQRPGDLV* 
NBP-37       GGGDAW* 
NBP-41       GGGDPV* 
NBP-42       GGGGDAV* 
NBP-44       GGGGLARLNLSSYYGDAV* 
NBP-45       GGGVDWV* 
NBP-47       GGGRVIGSPNPSRSADIV* 
NBP-48       GGGDWV* 
NBP-49       GGGSFMNBPVAGTAGDSV* 
NBP-52       GGGSRGDMV* 
NBP-53       GGGDWJ* 
NBP-54       GGGDGKLLRRPC.'LRWIFC* 
NBP-55       GGGKRDETGFNMWGNAV* 
NBP-56       GGGWQGDPV* 
NBP-57       GGGALGDPV* 
NBP-59       GGGDPV* 
NBP-60       GGGGDLV* 
NBP-61       GGGESGSGVP.TWGVPV* 
NBP-62       GGGRVQLVRGGVDCV* 
NBP-64       GGGDAV* 
NBP-65       GGGWRWKSVMRWPDPV* 
NBP-66       GGGD~LV* 
NBP-67       GGGSKSCGRVILGDIV* 
NBP-68       GGGVDWV* 
NBP-69       GGGIIQGQARGTRWGEMV* 
NBP-70       GGGDAV* 
NBP-71       GGGGGWPELXFXLLGVPI* 
NBP-72       GGGRCMLNLVTGRWATTV* 
NBP-73       GGGGMGQTLEELTTGDWV* 
NBP-74       GGGDRGWAVGWGLRGVPV* 
NBP-76       GGGGPARYGDSV* 
NBP-77       GGGDLV* 
NBP-78       GGGFSSLVLGAGDLGVAP* 
NBP-79       GGGMQWWAQRLLAGDCV* 
NBP-81       GGGKLGG.-iQGAI7rFGDAV* 
NBP-82       GGGTWGRAV* 
CLONE NO.    SEQUENCE 

NBP-83       GGGLKSTGSEVNSLGDW* 
NBP-84       GGGSEATAVWTSKWSDLV* 
NBP-85       GGGFVSSVRYSGVAGDQV* 
NBP-86       GGGLWSDAV* 
NBP-87       GGGRVTGRSSYLGMGDIV* 
NBP-88       GGGDMV* 
NBP-89       GGGKFSVRHTLVSAGDPV* 
NBP-91       GGGARGQLPATRCKAFLC* 
NBP-92       GGGYEEGVAV* 
NBP-93       GGGDRV* 
NBP-94       GGGDLV* 
NBP-95       GGGVRGALTRGMTPGDPV* 
NBP-96       GGGDLV* 
NBP-102      GGGVAGVGKYGDLV* 
NBP-103      GGGDLV* 
NBP-107      GGGDVI* 
NBP-108      GGGKMRVGVDAV* 
NBP-111      GGGDPV* 
NBP-112      GGGRDSERLMGIPV* 
NBP-113      GGGDQV* 
NBP-114      GGGRWSEGDGV* 
NBP-117      GGGLGRGSVRPGRRPDIV* 
NBP-118      GGGDW* 
NBP-119      GGGIKRLDIYMRNIGDLV* 
NBP-122      GGGSATAWNGDPV* 
NBP-123      GGGLDRLRNRVHGDAV* 
NBP-124      GGGREVSVCHRPDAGDAV* 
NBP-125      GGGSRVPRNTSIFWGNAV* 
NBP-12t      GGGDCGNVTHAILWGDAV* 
NBP-12£      GGGICALGAIYVMGGVDAV* 
NBP-129      GGGWGSPV* 
NBP-131      GGGrCGSPSLVGPVWADAV* 
NBP-133      GGGILNPVPRNLSEGDYV* 
NBP-136      GGGDQV* 
NBP-137      GGGGERLNRSATAGADLV* 
NBP-136      GGGEGGRNPDIV* 
NBP-140      GGGNQRYWNPFIWGQSV* 
NBP-142      GGGDSINLSWPVAV* 
NBP-143      GGGCMLQVRKIY3?CDAV* 
NBP-161      GGGVIGKSCYGDAV* 

FIG. 3C 
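
The C-terminal preference visible in the clone tables can be tallied position by position. The following is a minimal sketch, assuming only that each sequence ends with a `*` stop marker as drawn in the figures; the function names are hypothetical, and the six clones are a hand-picked subset of FIG. 3C chosen because they are legible, so the counts illustrate the method and are not a statistic for the whole library.

```python
from collections import Counter


def c_terminal_tripeptide(sequence):
    """Return the last three residues, ignoring spaces and the '*' stop marker."""
    cleaned = sequence.replace(" ", "").rstrip("*")
    return cleaned[-3:]


def tally_positions(sequences):
    """Count residues at the -2, -1 and 0 (C-terminal) positions."""
    counts = {-2: Counter(), -1: Counter(), 0: Counter()}
    for sequence in sequences:
        tripeptide = c_terminal_tripeptide(sequence)
        if len(tripeptide) != 3:
            continue  # skip anything too short to have three terminal residues
        for position, residue in zip((-2, -1, 0), tripeptide):
            counts[position][residue] += 1
    return counts


# Subset of FIG. 3C clones, copied as printed.
clones = {
    "NBP-85": "GGGFVSSVRYSGVAGDQV*",
    "NBP-86": "GGGLWSDAV*",
    "NBP-87": "GGGRVTGRSSYLGMGDIV*",
    "NBP-92": "GGGYEEGVAV*",
    "NBP-93": "GGGDRV*",
    "NBP-94": "GGGDLV*",
}

counts = tally_positions(clones.values())
for position in (-2, -1, 0):
    print(position, counts[position].most_common())
```

For this subset, Asp dominates position -2 and Val occupies position 0 in every clone, which is the Asp-X-Val pattern described in the specification.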
1. Endoplasmic reticulum targeting sequence 

2. Microbodies C-terminal targeting signal 

3. Gram-positive cocci surface proteins anchoring hexapeptide 

4. Bipartite nuclear targeting sequence 

5. Cell attachment sequence 

6. ATP/GTP-binding site motif A (P-loop) 

7. Cyclic nucleotide-binding domain signatures 

8. EF-hand calcium-binding domain 

9. Actinin-type actin-binding domain signatures 

10. Anaphylatoxin domain signature and profile 

11. Apple domain 

12. Band 4.1 family domain signatures 

13. C1q domain signature 

14. C-terminal cystine knot signature and profile 

15. CUB domain profile 

16. Death domain profile 

17. EGF-like domain signatures 

18. Calcium-binding EGF-like domain signature 

19. Forkhead-associated (FHA) domain profile 

20. Fibrinogen beta and gamma chains C-terminal domain signature 

21. Type II fibronectin collagen-binding domain 

22. Hemopexin domain signature 

23. Kringle domain signature 

24. LDL-receptor class A (LDLRA) domain signature 

25. C-type lectin domain signature 

26. Osteonectin domain signatures 

27. Somatomedin B domain signature 

28. Thyroglobulin type-1 repeat signature 

29. P-type ("Trefoil") domain signature 

30. Cellulose-binding domain, bacterial type 

31. Cellulose-binding domain, fungal type 

32. Chitin recognition or binding domain signature 

33. Barwin domain signatures 

34. WAP-type 'four-disulfide core' domain signature 

35. Phorbol esters/diacylglycerol binding domain 

36. C2 domain signature and profile 

37. CAP-Gly domain signature 

38. Ly-6/u-PAR domain signature 

39. MAM domain signature 

40. PH domain profile 

41. Phosphotyrosine interaction domain (PID) profile 

42. Src homology 2 (SH2) domain profile 

43. Src homology 3 (SH3) domain profile 

44. VWFC domain signature 

45. WW/rsp5/WWP domain signature and profile 

46. ZP domain signature 

47. S-layer homology domain signature 

FIG. 7 
Results: PDZ scan (D-X-V) vs. non-redundant protein database 



>gilS6V07ipir.LOii?9 12fK fusion protein - human g:^37G3-i (M21610) RNA polymerase II [Homo 
sapiens] (Match DLV) 

>giil8l745 MHC class II cell surface ~rzi±:r. [Homo sapiens] (Match DTV) 

>r.i5-:7:2^.rjcc39'SCD?_HU\L^V 7,8-DIHYDRO-8-OXOGUANINE TRIPHOSPHATASE (8-OXO- 
DGTPASE). £L5427-9 [ c:r:lA-SSSc* 8-oxo-7,8-dihydroguanosine triphosphatase - human ^1^52539 
(D16531) 8-oxo-dGTPase [Homo sapiens] g:Il^C53fC (D33f9-*) 8-oxo-dGTPase [Homo sapiens] (Match 
DTV) 

>gil 177776 0*1363-^1) serotonin receptor [Homo sapiens] (Match DGV) 

>r-ii u9^7iszi?:£279;3GAj,[.:-rj>LA_v BETA-GALACTOSIDASE-RELATED PROTEIN 
PRECURSOR. p::Of-:-:c::;iE:2d53 beta-galactosidase-related protein - human gii 179-2 I (M27508) 
beta-galactosidase-related protein precursor [Homo sapiens] (Match DKV) 

>giilS -273 (M3-liC) lysyl oxidase [Homo sapiens] (Match DLV) 

>gi!553572 (MS2837) MHC class II HLA-DQ-alpha-1 [Homo sapiens] (Match DPV) 

>giil U9^spi?l 62783 C^Al^IUNLAN BETA-GALACTOSIDASE PRECURSOR 
(LACTASE). g:i36?33:?ir!A22£ll beta-galactosidase (EC 3.2.1.23) precursor - human ei!179401 
(M275G7) beta-D-galactosidase precursor (EC 3.2.1.23) [Homo sapiens] gi!179423 (M34423) beta- 
galactosidase precursor (EC 3.2.1.23) [Homo sapiens] (Match DKV) 

>gi!1794l9 (M225SC) beta-galactosidase precursor (EC 3.2.1.23) [Homo sapiens] (Match DKV) 
>gili3l759 (M63195) DR3 1 transplantation antigen [Homo sapiens] (Match DTV) 
>gill24462!sciPI"l3 IIINTU.KUNLaN INTERFERON-ALPHA/BETA RECEPTOR ALPHA CHAIN 
PRECURSOR (IF>'-A2L?M^-?^C). g:ll06790lcir:lA32694 interferon alpha receptor precursor - 
human gii3C69!-i (;C317l) interferon-alpha receptor precursor [Homo 
sapiens] g ;i 1 *67385igrJiPrDieZ5 L62S (A32391) chimeric IFNalpha/beta receptor [Homo sapiens] (Match 
DFV) 

>gil30972 (2M2C6) Ig heavy chain variable region (VDJ) [Homo sapiens] (Match DMV) 

>gi!32672 (X6C^5r) Human IFNAR gene for interferon alpha/beta receptor [Homo sapiens] (Match DFV) 

>giil25472!scfPi07" L i K KIT _ KL~\ LAN* MAST/STEM CELL GROWTH FACTOR RECEPTOR 
PRECURSOR (SCFR) (PROTO-ONCOGENE TYROSINE-PROTEIN KINASE KIT) (C-KIT) 
(CD117). gii€c3Ulpir:iTV"HUKT protein-tyrosine kinase (EC 2.7.1.112) kit precursor - human eil34085 
(X061S2) protein p145-c-kit (AA 1 - 976) [Homo sapiens] g!i32i636 (X69301) mast/stem cell growth 
factor receptor [Homo sapiens] (Match DDV) 

>gil34992 fXITloi) Beta 1-subunit of Na(+),K(+)-ATPase [Homo sapiens] (Match DRV) 
>gil63l336tpirllS425c3 POU domain protein - human g:i4373G9 (221963) POU domain protein [Homo 
sapiens] (Match DVV) 

>gil437311 (221964) POU domain protein [Homo sapiens] (Match DVV) 
>giU37SI3 (221965) POU domain protein [Homo sapiens] (Match DVV) 
>giIU709SlspiPuu67^;COXA_HUMAN CYTOCHROME C OXIDASE POLYPEPTIDE Va 
PRECURSOR. gi!6c276ipir:!OT:™:U5A cytochrome-c oxidase (EC 1.9.3.1) chain Va precursor - 
human giI695360 (M22760) cytochrome c oxidase subunit Va [Homo sapiens] (Match DKV) 
"-Net numar.-->-/5 35~G9isc!C045^;POLN_SOUV3 NONSTRUCTURAL POLYPROTEIN 
(CONTAINS: RNA-DIRECTED RNA POLYMERASE, THIOL PROTEASE, HELICASE (2C LIKE 
PROTEIN)). giU75733ipiri!A37-:9L orf1 putative helicase/polymerase polyprotein - Southampton 
virus gii^36^prf.IlSC6-i03 rheumatoid factor VH [Homo sapiens] (Match DGV) 

Figure 8A 
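
A scan of the kind summarized in this figure (proteins whose C-terminus matches Asp-X-Val) can be sketched with a regular expression. This is a minimal illustration, assuming the match is anchored at the extreme C-terminus; the function name and the three example records are hypothetical, and the search tool actually used to generate the figure is not identified here.

```python
import re

# D-X-V at the extreme C-terminus: Asp, any one residue, Val, end of sequence.
DXV_AT_C_TERMINUS = re.compile(r"D[A-Z]V$")


def scan_for_dxv(records):
    """Return (name, matched tripeptide) for each sequence ending in D-X-V."""
    hits = []
    for name, sequence in records:
        # Normalize case and drop spaces and a trailing '*' stop marker.
        cleaned = sequence.replace(" ", "").upper().rstrip("*")
        match = DXV_AT_C_TERMINUS.search(cleaned)
        if match:
            hits.append((name, match.group(0)))
    return hits


# Hypothetical records for illustration; these are not database entries.
example_records = [
    ("example_1", "MKTAYIAKQRDLV"),   # ends in DLV, so it is a hit
    ("example_2", "MKTAYIAKQRDLVA"),  # D-X-V is internal, so it is not a hit
    ("example_3", "mstnpkpqrkdtv*"),  # lower case with stop marker, still a hit
]

hits = scan_for_dxv(example_records)
for name, tripeptide in hits:
    print(f">{name} (Match {tripeptide})")
```

The output line format mirrors the "(Match DLV)" annotations of the listing.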
>gil1346544lsplP48039IML1A_HUMAN MELATONIN RECEPTOR TYPE 1A 
(MEL-1A-R). gil602130 (U14108) Mel-1a melatonin receptor [Homo sapiens] 
(Match DSV) 

>gil726255 (U22228) aggrecan [Homo sapiens] (Match DFV) 

>gil793763 (D26512) MT-MMP [Homo sapiens] (Match DKV) 

>gil804994 (X83535) MT-MMP [Homo sapiens] (Match DKV) 

>gil963054 (Z48481) membrane-type matrix metalloproteinase 1 [Homo sapiens] 
gill 127837 (U41078) membrane-type matrix metalloproteinase-1 [Homo sapiens] 
(Match DKV) 

>gil976297 (L37839) This CDS feature is included to show the translation of the 
corresponding V_segment. Presently translation qualifiers on V_segment features 
are illegal. [Homo sapiens] (Match DAV) 

>gill 24746 UgnllPIDIe200676 (A26595) interferon beta receptor [Homo sapiens] 
(Match DFV) 

>gil 1262584 (D90161) leader sequence. L' [Homo sapiens] (Match DPV) 
>gill495995lgnllPIDIe196537 (X90925) MT-MMP protein [Homo sapiens] (Match 
DKV) 



Mouse 

>gil244607lbbsl79586 cleaved prolactin-1. clPRL-1=fragment A [rats. Peptide 
Partial. 20 aa] (Match DRV) 

>gil497021 (U05699) cytochrome c oxidase subunit Va [Mus spretus] (Match 
DKV) 

>gil505029 (D14849) meiosis-specific nuclear structural protein 1 [Mus musculus] 
(Match DGV) 

>gil531881 (U12877) vascular cell adhesion molecule-1 [Mus musculus] (Match 
DTV) 

>gill91913 (M11895) A-l alpha-amylase [Mus musculus] (Match DKV) 

>gill91919 (M11896) B-l alpha-amylase [Mus musculus] (Match DKV) 

>gil 192098 (M18187) B144 protein A [Mus musculus] (Match DYV) 

>gil 196056 (M34984) Ig H-chain [Mus musculus] (Match DTV) 

>gil554244 (K03547) myb protein [Mus musculus] (Match DSV) 

>gil1363194lpirllA53202 MAMA protein precursor - mouse gil297033 (X67809) 
mama gene product [Mus musculus] (Match DMV) 

>gil423447lpirllS35792 glutamate receptor GluR6C - mouse gil312494 (X66117) 
glutamate receptor subunit GluR6C [Mus musculus] (Match DTV) 
>gill17099lsplP12787ICOXA_MOUSE CYTOCHROME C OXIDASE 
POLYPEPTIDE VA PRECURSOR. gi |c >O42()lpirl!S05495 cytochrome-c oxidase 
(EC 1.9.3.1) chain Va precursor - mouse gil5()52~ (X15963) cytochrome c oxidase 
subunit Va preprotein [Mus musculus] (Match DKV) 
>gil805000 (X83536) MT-MMP [Mus musculus] (Match DKV) 
>gil939 c ">5 1 (X73037) partial paired box; pid:e~49S5 [Mus musculus] (Match DGV) 
>gill 1S4S77 (U4t>562) MHC class II transactivator CIITA [Mus musculus] (Match 
DMV) 

>gil 1215666 (Ul~ , 267) T cell receptor-Zeta [Mus musculus] (Match DEV) 

>gil 1326151 (U52222) Mel-1a melatonin receptor [Mus musculus] (Match DSV) 



Rat 

>gil666942 (M22615) cholesterol side-chain cleavage enzyme [Rattus norvegicus] 
(Match DTV) 

>gill 124371pirlIS20612 triacylglycerol lipase (EC 3.1.1.3) - rat gil56600 (X61925) 
triacylglycerol lipase [Rattus norvegicus] (Match DTV) 
>gill17262lsplP14137ICPM1_RAT CYTOCHROME P450 XIA1, 
MITOCHONDRIAL PRECURSOR (P450(SCC)) (CHOLESTEROL SIDE- 
CHAIN CLEAVAGE ENZYME) (CHOLESTEROL DESMOLASE). 
gil92074lpirllA34164 cholesterol monooxygenase (side-chain-cleaving) (EC 
1.14.15.6) cytochrome P450 11A1 - rat gil203561 (M63133) cytochrome P-450-scc 
[Rattus norvegicus] gil203639 (J05156) cholesterol side-chain cleavage enzyme 
precursor (EC 1.14.15.6) [Rattus norvegicus] (Match DTV) 
>gil204101 (K01336) beta-fibrinogen [Rattus norvegicus] (Match DKV) 
>gil206148 (M16960) calcium-calmodulin-dependent protein kinase II [Rattus 
norvegicus] (Match DGV) 

>gill17100lsplP11240ICOXA_RAT CYTOCHROME C OXIDASE 
POLYPEPTIDE VA PRECURSOR. gil92182lpirllS04592 cytochrome-c oxidase 
(EC 1.9.3.1) chain Va precursor - rat gil55971 (X15030) cytochrome c oxidase 
subunit Va preprotein [Rattus norvegicus] (Match DKV) 

>gil682650 (L19118) complement receptor type 1 [Rattus norvegicus] (Match 
DQV) 

>gil805013 (X83537) MT-MMP [Rattus norvegicus] (Match DKV) 
>gill001927 (X91785) membrane-type metalloproteinase [Rattus norvegicus] 
(Match DKV) 
>gill33429filgnllPIDIe10391 (X03914) interleukin-3 (aa 102-115) [Rattus 
norvegicus] (Match DSV) 

D. melanogaster 

>gil461852lsplP35220ICTNA_DROME ALPHA-CATENIN. gil422436lpirllA40694 
cadherin-associated protein D alpha-catenin - fruit fly (Drosophila melanogaster) 
gil285752 (D13964) alpha-catenin [Drosophila melanogaster] (Match DAV) 
>gil259790lbbsl117942 (S48157) DNA polymerase-primase 180 kda subunit 
[Drosophila melanogaster. Peptide. 1490 aa] (Match DVV) 
>gil546972lbbsl 148992 (S70576) putative receptor tyrosine kinase=Drc 
[Drosophila melanogaster. Canton-S, Peptide Partial. 817 aa] (Match DAV) 
>gil321036ipirllPS0443 potassium channel protein Slo G3 - fruit fly (Drosophila 
melanogaster) (fragment) (Match DLV) 



C. elegans 

>gil465792lsplP34428IYL37_CAEEL HYPOTHETICAL 45.5 KD PROTEIN 
F44B9.7 IN CHROMOSOME III. gil6306261pirilS448 10 F44B9.7 protein - 
Caenorhabditis elegans gil388589 (L23648) putative [Caenorhabditis elegans] 
(Match DQV) 

>gil466054lsplP34680IYO42_CAEEL HYPOTHETICAL 32.7 KD PROTEIN 
ZK757.2 IN CHROMOSOME III. gil4S2218lpir!IS4l()12 hypothetical protein 
ZK757.2 - Caenorhabditis elegans gil43S368 (Z291 2 1) ZK757.2 [Caenorhabditis 
elegans] (Match DVV) 

>gil458953 (U00031) similar to phosphatidylserine decarboxylase [Caenorhabditis 
elegans] (Match DGV) 

>gil722365 (U22833) W02B3.5 [Caenorhabditis elegans] (Match DFV) 
>gil746503 (U23516) B0416.2 gene product [Caenorhabditis elegans] (Match 
DDV) 

>gil 10 19950 (U37429) similar to protein kinase C [Caenorhabditis elegans] (Match 
DSV) 

>gil 1055055 (U39850) coded for by C. elegans cDNA yk37gl.5: coded for by C. 
elegans cDNA yk5c9.5: coded for by C. elegans cDNA ykla9.5: alternatively 
spliced form of F52C9.8b [Caenorhabditis elegans] (Match DNV) 

>gil1055110 (U39995) coded for by C. elegans cDNA yk25b9.3: coded for by C. 
elegans cDNA yk25b9.5 [Caenorhabditis elegans] (Match DRV) 

>gil 1086851 (U41270) Similar to transmembrane domain of family 1 of G-protein 
coupled receptors. [Caenorhabditis elegans] (Match DEV) 

>gil 1082 139 (Z681 18) R01E6.2 [Caenorhabditis elegans] (Match DFV) 
>gill 100S68lgnllPIDIe212230 (Z68135) ZKHr3.2 [Caenorhabditis elegans] 
(Match DNV) 

>gi!1352438lsplQ10055IIF4N_SCHPO EUKARYOTIC INITIATION FACTOR 
4A-LIKE PROTEIN C1F5.10. gill 103737 (Z68136) unknown 
[Schizosaccharomyces pombe] (Match DMV) 

>gill 1 18060 (U41552) coded for by C. elegans cDNA yk3dl 1.5: coded for by C. 
elegans cDNA yk5f4.5 [Caenorhabditis elegans] (Match DIV) 

>gill 125770 (U42838) T0SG2. 2 gene product [Caenorhabditis elegans] (Match 
DDV) 

>gill 185450 (U36581) cyclophilin isoform 9 [Caenorhabditis elegans] (Match 
DLV) 

>gill229053lgnllPIDIe229193 (Z70207) F15A2.6 [Caenorhabditis elegans] (Match 
DKV) 

>gil 1255324 (U51999) C43H6.7 gene product [Caenorhabditis elegans] (Match 
DIV) 

>gil 1255397 (U53150) F20A1.2 gene product [Caenorhabditis elegans] (Match 
DSV) 

>gill313955ignlIPIDIe241752 (Z73098) T21C9.13 [Caenorhabditis elegans] (Match 
DIV) 

>gil 16277 17lgnllPIDIe276022 (Z81053) E02A10.4 [Caenorhabditis elegans] 
(Match DIV) 

>gill627903lgnllPIDIe275743 (Z81076) F35C5T [Caenorhabditis elegans] (Match 
DGV) 

>gil 1658357 (U64849) K04A8.8 gene product [Caenorhabditis elegans] (Match 
DKV) 



S. cerevisiae 

>gil728821lsplP39010IAKR1_YEAST ANKYRIN REPEAT-CONTAINING 
PROTEIN AKR1. gil626094lpirllS48521 AKR1 protein - yeast (Saccharomyces 
cerevisiae) gil466522 (L31407) ankyrin repeat-containing protein [Saccharomyces 
cerevisiae] gil 1 230637 (U51030) Ankyrin repeat-containing 
protein (Swiss Prot. accession number P39010). [Saccharomyces 
cerevisiae] gil 1586336lprflI22()3403A ankyrin repeat-containing protein 
[Saccharomyces cerevisiae] (Match DMV) 

>gil731840lsplP40500IYII9_YEAST HYPOTHETICAL 23.9 KD PROTEIN IN 
SGA1-THS1 INTERGENIC REGION, gil 1 0777S5ipiri!S4979 1 hypothetical protein 
YI9910.07 - yeast (Saccharomyces cerevisiae) gil577125 (Z46728) YI9910.07. 
unknown orf. len: 205. CAI: 0.11 [Saccharomyces cerevisiae] gil763257 (Z47047) 
unknown [Saccharomyces cerevisiae] (Match DEV) 

>gill40345lsplP25554IYCB0_YEAST HYPOTHETICAL 16.6 KD PROTEIN IN 
GBP2-PEL1 INTERGENIC REGION. gil83 138lpirllS 19337 hypothetical protein 
YCL010c - yeast (Saccharomyces cerevisiae) gil5358lgnllPIDIe264452 (X59720) 
YCL010c, len:146 [Saccharomyces cerevisiae] (Match DTV) 
>gil731426lsplP39941IYEI0_YEAST HYPOTHETICAL 56.5 KD PROTEIN IN 
HXT8 5'REGION. gill077619lpirllS50519 hypothetical protein YEL070w - yeast 
(Saccharomyces cerevisiae) gil603248 (U18795) Yel070p [Saccharomyces 
cerevisiae] gil 13026 10lgnllPIDIe239852 (Z71688) ORF YNR073c [Saccharomyces 
cerevisiae] (Match DQV) 

>gill174566lsplP41896IT2FB_YEAST TRANSCRIPTION INITIATION FACTOR 
IIF, BETA SUBUNIT (TFIIF-BETA) (TFIIF MEDIUM SUBUNIT) 
(TRANSCRIPTION FACTOR G 54 KD SUBUNIT). gill078424lpirllB55482 
transcription initiation factor IIF 54K chain - yeast (Saccharomyces cerevisiae) 
gil639703 (U13016) transcription initiation factor TFIIF middle subunit 
[Saccharomyces cerevisiae] (Match DVV) 

>gil825501 (L42348) HOL1 [Saccharomyces cerevisiae] (Match DGV) 
>gil258767lbbsll 17066 cytochrome c oxidase Via subunit homolog 
[Saccharomyces cerevisiae. JHRY1-2 alpha. Peptide Partial. 19 aa. segment 1 of 5] 
(Match DKV) 

>gil847740 (U19781) beta-fructofuranosidase 2 precursor [Saccharomyces 
cerevisiae] (Match DTV) 

>gil914979 (U32445) P8283.8 gene product [Saccharomyces cerevisiae] (Match 
DRV) 

>gil 135304 lisplP46984IYJS4_YEAST HYPOTHETICAL 13.6 KD PROTEIN IN 
SWE1-ATP12 INTERGENIC REGION. gill077849lpirllS56967 hypothetical 
protein YJL184w - yeast (Saccharomyces cerevisiae) gil 1008389 (Z49459) ORF 
YJL184w. pid:e20T216 [Saccharomyces cerevisiae] (Match DAV) 
>gill352875lsplP47104IYJ03_YEAST HYPOTHETICAL 154.9 KD PROTEIN IN 
MER2-PET191 INTERGENIC REGION, gil 1077S78!pirllS57052 hypothetical 
protein YJR033c - yeast (Saccharomyces cerevisiae) gil 10 15679 (Z49533) ORF 
YJR033c. pid:e203690 [Saccharomyces cerevisiae] (Match DFV) 
>gilll29167 (X87297) J 1590 gene product [Saccharomyces cerevisiae] (Match 
DFV) 

>gilll34890 (Z68290) Akr1p [Saccharomyces cerevisiae] gil 1226040 (Z70202) 
Akr1p [Saccharomyces cerevisiae] (Match DMV) 
>gil 1 3025^4lgnliPIDie239S4 1 (Z71670) ORF YNR055c [Saccharomyces 
cerevisiae] (Match DGV) 

>gill322S79lgnliPIDie243SS~ iZ72^4S^ ORF YGL226w [Saccharomyces 
cerevisiae] (Match DLV) 

>gill322961ignl!PIDIe24336o lZ72"90> ORF YGR005c [Saccharomyces 
cerevisiae] (Match DVV) 

>gill323286lgnllPIDie243550 (Z72948) ORF YGR163w [Saccharomyces 
cerevisiae] (Match DDV) 

>gi!1420794lgnliPIDIe252191 (Z75275) ORF YOR367w [Saccharomyces 
cerevisiae] (Match DIV) 



Other 



>gii401194ispiP31015ITNA2_SYMTH TRYPTOPHANASE 2 (L-TRYPTOPHAN 
INDOLE-LYASE 2). gil477858ipirllB49022 tryptophanase (EC 4.1.99.1) Tna2 - 
Symbiobacterium thermophilum gil216979 (D10013) tryptophanase 
[Symbiobacterium thermophilum] (Match DLV) 

>gill55612 (L09651) phosphoglycerate mutase [Zymomonas mobilis] (Match 
DLV) 

>gill361344lpirllD36891 transfer complex protein TrsC - Staphylococcus aureus 
gil310610 (L11998) putative [Staphylococcus aureus] gi!405562 (L19570) putative 
[Plasmid pSK41] gil739958iprfll2004267D membrane protein traC [Staphylococcus 
sp.] (Match DDV) 

>gi!625710ipirllC49695 4-methyl-5-(beta-hydroxyethyl)thiazole monophosphate 
synthesis protein ThiF - Escherichia coli gi!414234 (M88701) thiF [Escherichia 
coli] (Match DPV) 

>gil97777lpirllA38729 pyruvate decarboxylase (EC 4.1.1.1) - Sarcina ventriculi 
(fragment) gil249565tbbsl 103674 pyruvate decarboxylase {EC 4.1.1.1} [Sarcina 
ventriculi. strain JK. Peptide Partial. 36 aa] (Match DYV) 
>gil29S240lbbsl 1 25733 DNA polymerase homolog [bacterium-like organism, citrus 
greening disease-associated. Peptide. 207 aa] (Match DLV) 

>gil477~l 73lpirll A4S368 N5,N10-methenyltetrahydromethanopterin cyclohydrolase 
- Archaeoglobus fulgidus (fragment) gil29988 1 Ibbsl 1 30469 N5,N10- 
methenyltetrahydromethanopterin cyclohydrolase {N-terminal} [Archaeoglobus 
fulgidus. VC-19. DSM 4304. Peptide Partial. 3S aa] (Match DGV) 
>gil406020 (U01764) unknown [Mycoplasma genitalium] (Match DSV) 
>gil414513 (U02113) homology to ribosomal protein LI Zl 1839 [Mycoplasma 
genitalium] (Match DVV) 

>gil396331 (U00006) similar to E. coli ChlN [Escherichia coli] (Match DPV) 
>gil543897lsplP35804IBLIP_STRCL BETA-LACTAMASE INHIBITORY 
PROTEIN PRECURSOR (BLIP). gil98890lpirllA36710 beta-Lactamase inhibitory 
protein precursor - Streptomyces clavuligerus gi! 153 192 (M34538) beta-lactamase 
inhibitory protein precursor [Streptomyces clavuligerus] (Match DLV) 
>gil53S757ipirllA53488 heat shock cognate protein 66 - Escherichia coli gil454766 
(U05338) Hsc66 [Escherichia coli] (Match DEV) 

>gil461079lbbsl 142342 GroEL homolog {N-terminal} [Francisella tularensis. LVS. 
Peptide Partial. 18 aa] (Match DGV) 

>gil547685isplP36541IHSCA_ECOLI HEAT SHOCK PROTEIN HSCA (HSC66). 
gill073308lpirllB36958 66K hsp70 homolog HscA - Escherichia coli gil402675 
(U01827) Hsp70 [Escherichia coli] (Match DEV) 

>gill29002lsplP07061INYLB_FLASP 6-AMINOHEXANOATE-DIMER 
HYDROLASE (NYLON OLIGOMERS DEGRADING ENZYME EII). 
gil77553lpirllA29516 6-aminohexanoate-dimer hydrolase (EC 3.5.1.46) EII - 
Flavobacterium sp. KI72 plasmid pOAD2 gil434l8 (X00046) EII enzyme 
[Flavobacterium sp.] gi!488340 (D26094) 6-aminohexanoate-dimer hydrolase 
[Flavobacterium sp.] gil223803iprfll0912258A enzyme RSIlA.nylon degrading 
[Flavobacterium sp.] (Match DAV) 

>gil488342 (D26094) 6-aminohexanoate-dimer hydrolase [Flavobacterium sp.] 
(Match DAV) 

>gil507769 (U09675) RNA polymerase beta subunit [Liberobacter africanum] 
(Match DGV) 

>gilll891 lisplP10740IDPSD_ECOLI PHOSPHATIDYLSERINE 
DECARBOXYLASE PROENZYME. gil78759ipir!IA29234 phosphatidylserine 
decarboxylase (EC 4.1.1.65) precursor - Escherichia coli gil537004 (U14003) 
phosphatidylserine decarboxylase [Escherichia coli] gii551827 (J03916) 
phosphatidylserine decarboxylase [Escherichia coli] (Match DQV) 
>gi!1361237lpirllS56466 phosphotransferase system trehalose permease - 
Escherichia coli gi!537082 (U14003) phosphotransferase system trehalose 
permease [Escherichia coli] (Match DIV) 

>gil4792201pirllS32798 merR protein - Xanthomonas sp. transposon Tn5053 
gil480554lpirllS37035 regulatory protein merR - Alcaligenes sp. 

FIG. 8E-1 






WO 98/23781    PCT/US97/21861 






gil480563lpirllS?7044 regulatory protein merR - Pseudomonas fluorescens 
gil 10861 70ipirilS5 1756 regulatory protein merR - Pseudomonas testosteroni 
gill 549 10 (L03729) putative [Transposon Tn5053] gii38S554 (L20693) mer operon 
regulator [Alcaligenes sp.] gii393198 (L20694) mer operon regulator [Plasmid 
pMER05] gil397588 (Z23094) merR regulatory protein (repressor /inducer) 
[Alcaligenes sp.] gil397618 (Z23095) merR regulatory protein (repressor /inducer) 
[Pseudomonas fluorescens] gil483767 (X73112) mercury resistance DNA-binding 
protein [Pseudomonas fluorescens] gi!607170 (Z33481) regulatory protein 
[Comamonas testosteroni] giI7 10575 (L405S5) merR regulatory protein (repressor 
/inducer) [Transposon Tn5053] (Match DAV) 

>gill42082 (L02520) ribulose 1,5-bisphosphate carboxylase/oxygenase large 
subunit [Anabaena sp.] gil 142086 (L02521) ribulose 1,5-bisphosphate 
carboxylase/oxygenase large subunit [Anabaena sp.] gil 142088 (L02522) ribulose 
1,5-bisphosphate carboxylase/oxygenase large subunit [Anabaena sp.] gil 142 105 
(J01540) ribulose-1,5-bisphosphate carboxylase large subunit (rbcL) [Anabaena 
sp.] (Match DTV) 

>gill075610lpirllS52644 phycobilisome maturation protein - Synechococcus sp. 
gil 142 130 (M94218) phycobilisome maturation protein [Anacystis nidulans] 
gil446765lprfll 1912291 J phycobilisome maturation protein [Synechococcus sp.] 
(Match DRV) 

>gil466182lsplP35151IYPUB_BACSU HYPOTHETICAL 7.2 KD PROTEIN IN 
PPIB-SIPS INTERGENIC REGION (ORFX1). gU6291 18lpirilS45538 hypothetical 
protein XI - Bacillus subtilis gi!410120 (L09228) ORFX1 [Bacillus subtilis] 
(Match DRV) 

>gil 142967 (M17642) succinate dehydrogenase [Bacillus subtilis] (Match DRV) 
>gil 1 1 86 1 3lsplP08066IDHSB_BACSU SUCCINATE DEHYDROGENASE IRON- 
SULFUR PROTEIN. gill075923lpirllB27763 succinate dehydrogenase (EC 
1.3.99.1) iron-sulfur protein - Bacillus subtilis gil 143527 (M13470) iron-sulfur 
protein [Bacillus subtilis] (Match DRV) 

>gil 144453 (M94320) very similar to DNA polymerase of Bacillus subtilis 
bacteriophage SP02; potential DNA polymerase: putative [Citrus greening disease- 
associated bacterium-like organism] (Match DLV) 
>gil785S7lpirllG25035 hypothetical protein 2 - Escherichia coli plasmid ColIa 
gil455439 (M13819) ORF2 [Plasmid ColIa] (Match DDV) 
>gil7S588lpirllH25035 hypothetical protein 2 - Escherichia coli plasmid ColIb 
g i|45544i (M13820) ORF2 [Plasmid ColIb] (Match DDV) 
>gill45313 (K01304) L-ribulokinase (araB) [Escherichia coli] (Match DSV) 
>gill20350lsplP26608IFLIS_ECOLI FLAGELLAR PROTEIN FLIS. gil 145989 
(M85240) flagellar protein [Escherichia coli] (Match DPV) 
>gill25924lsplP26593ILACD_LACLA TAGATOSE 1,6-DIPHOSPHATE 
ALDOLASE. gil97943lpirllD39778 LacD tagatose-1,6-diphosphate aldolase - 
Lactococcus lactis gil 149396 (M65190) lacD [Lactococcus lactis] gil 149409 
(M60447) tagatose 1,6-diP aldolase [Lactococcus lactis] (Match DKV) 
>gil68525ipirilSYEXI isoleucine-tRNA ligase (EC 6.1.1.5) - Methanobacterium 
thermoautotrophicum gill49728 (M59245) transfer RNA-Ile synthetase 
[Methanobacterium thermoautotrophicum] (Match DKV) 
>gill50352 (M84113) ORF1 [Transposon mini-Tn3Cm] (Match DAV) 
>gi!121875lsplP24375IGVPK_HALHA GVPK PROTEIN. gil8 1055lpirllJQl 128 
GvpK protein - Halobacterium halobium plasmid pNRC100 gil43524 (X55648) 
gvpK gene product [Halobacterium halobium] gil455299 (M58557) gas vesicle 
protein [Plasmid pNRC100] (Match DDV) 

>gill27013lsplP13111IMERR_SERMA MERCURIC RESISTANCE OPERON 
REGULATORY PROTEIN. gil96175lpirllA33858 merR protein - Escherichia coli 
plasmid pDU1358 gil455313 (M24940) mercury resistance protein [Plasmid 
pDU1358] (Match DAV) 

>gil 150838 (K02336) EII enzyme (6-aminohexanoic acid linear oligomer 
hydrolase) [Plasmid pOAD2] (Match DAV) 

>gil294462 (M28607) insB [Escherichia coli] (Match DKV) 

>gill21389lsplP13556IGLNB_RHOCA NITROGEN REGULATORY PROTEIN 
P-II. gil 15 1934 (M28244) glutamine synthetase glnB (EC 6.3.1.2) [Rhodobacter 
capsulatus] gil829596 (U25953) PII protein [Rhodobacter capsulatus] (Match 
DAV) 

>gill35828lsplP274771THTR_SYNP7 PUTATIVE THIOSULFATE 
SULFURTRANSFERASE PRECURSOR (RHODANESE-LIKE PROTEIN). 
gil28021 llpirllA43669 rhodanese homolog rhdA precursor - Synechococcus sp. 
gill54604 (M65244) rhdA [Synechococcus sp.] (Match DRV) 
>gi!731 176lsplP^981IXYLR_THERS PUTATIVE XYLOSE REPRESSOR. 
gii632297ipirllS41787 xylR protein - Thermophilic bacterium gi!311188 (L18965) 
putative xylose repressor gene; putative [Thermophilic bacterial sp.] (Match DYV) 
>gill 175762lsplP46015IYDEB_ANASP HYPOTHETICAL PROTEIN IN DEVB 
5'REGION. gil556606 (U14553) ORF [Anabaena sp.] (Match DYV) 
>gill072948TpirllS51047 mauR protein - Paracoccus denitrificans gi!558803 
(U12464) LysR-type transcriptional activator [Paracoccus denitrificans] (Match 
DAV) 
>gil629404lpirllS48S3? cytochrome-c3 hydrogenase (EC 1.12.2.1) alpha chain - 
Pyrococcus furiosus gi!563905 (X75255) hydrogenase (alpha subunit) [Pyrococcus 
furiosus] (Match DGV) 

>gill30794lsplP07781IPQQ2_ACICA COENZYME PQQ SYNTHESIS PROTEIN 
II. gil953 18lpirllE32252 gene II protein - Acinetobacter calcoaceticus gil3S744 
(X06452) gene II [Acinetobacter calcoaceticus] (Match DLV) 
>gill2S258lsplP10996INIFE_CLOPA NITROGENASE IRON-MOLYBDENUM 
COFACTOR BIOSYNTHESIS PROTEIN NIFE. gil80505ipirllS04079 nitrogenase 
(EC 1.18.6.1) molybdenum-iron protein nifE - Clostridium pasteurianum gil40587 
(X13606) NifE protein (AA 1 - 456) [Clostridium pasteurianum] (Match DYV) 
>gil547614lsplP36553IHEM6_ECOLI COPROPORPHYRINOGEN III OXIDASE, 
AEROBIC (COPROPORPHYRINOGENASE) (COPROGEN OXIDASE). 
gill073344lpirllB36964 coproporphyrinogen oxidase (EC 1.3.3.3), aerobic - 
Escherichia coli gil453969 (X75413) coproporphyrinogen oxidase [Escherichia 
coli] (Match DWV) 

>gil95681lpirllS06878 beta-Galactosidase (EC 3.2.1.23) - Escherichia coli 
(fragment) gil41904 (X16313) lacZ 5'-region [Escherichia coli] (Match DGV) 
>gil78569lpirllS04774 hypothetical protein - Escherichia coli (fragment) gil42746 
(X15859) open reading frame (122 AA); pid:g42746 [Escherichia coli] (Match 
DQV) 

>gil129003lsplP07062INYLC_FLASP 6-AMINOHEXANOATE-DIMER 
HYDROLASE (NYLON OLIGOMERS DEGRADING ENZYME EII'). 
gil77554lpirllB22644 6-aminohexanoate-dimer hydrolase (EC 3.5.1.46) EII' - 
Flavobacterium sp. plasmid pOAD2 gi!43420 (X02864) EII' (aa 1-392) 
[Flavobacterium sp.] gil223804lprfll0912258B enzyme RSIIB. nylon degrading 
[Flavobacterium sp.] (Match DAV) 

>gil79O56lpirl!JH0207 hypothetical 10.8K protein - Enterococcus faecalis plasmid 
pAM-beta-1 gil45739 (X17092) ORFF (ttg start codon) [Enterococcus faecalis] 
(Match DFV) 

>gill l4867isplP26l77IBCHX_RHOCA CHLOROPHYLLIDE REDUCTASE 35.5 
KD CHAIN (CHLORIN REDUCTASE). gi!795 l3lpirllS 17823 protochlorophyllide 
reductase (EC 1.3.1.33) 35.5K chain - Rhodobacter capsulatus gil46131 (Z11165) 
333 aa (35.5 kD) chlorophyllide reductase subunit, also known as chlorophyll Fe 
protein [Rhodobacter capsulatus] (Match DDV) 

>gi!116927lsplP24716ICOPR_STRAG PLASMID COPY CONTROL PROTEIN 
COPR. gil98007lpirllS22829 hypothetical protein - Streptococcus agalactiae 
gil581557 (X62150) 92 aa polypeptide [Streptococcus agalactiae] gil769739 
(X72021) circular [Streptococcus agalactiae] (Match DFV) 
>gi!134993lsplP09398ISTRG_STRGR STREPTOMYCIN BIOSYNTHESIS 
PROTEIN STRG. gil80801 ipirilS 17777 strG protein - Streptomyces griseus 
gil49266 (Y00459) strG [Streptomyces griseus] (Match DTV) 
>gil401018lsplP31814IRPOB_THECE DNA-DIRECTED RNA POLYMERASE 
SUBUNIT B. gi!280354ipirllS25563 DNA-directed RNA polymerase (EC 2.7.7.6) 
chain B - Thermococcus celer gi 148 140 (X67313) Subunit B of DNA-dependent 
RNA polymerase [Thermococcus celer] (Match DRV) 

>gil625666lpirllA36925 LysR-type transcriptional activator CbbR - Xanthobacter 
flavus gil581832 (Z22705) DNA-binding protein [Xanthobacter flavus] (Match 
DPV) " 

>gii5 15608 (Z35397) C. sativus 3-ketoacyl-CoA thiolase [Arabidopsis thaliana] 
(Match DIV) 

>gil451328 (U02021 > ecdysteroid receptor [Aedes aegypti] (Match DQV) 
>gil413919 (D21 101) Guanyi Cyclase [Hemicentrotus pulchernmus] (Match DDV) 
>2il5 14269 (U07706) dihvdropteroate synthetase [Plasmodium falciparum] (Match 
DQV) 

>gil505159 (Z30659) dihydropteroate synthetase [Plasmodium falciparum] 
gil505169 (Z30665) dihydropteroate synthetase [Plasmodium falciparum] 
gil505171 (Z30655) dihydropteroate synthetase [Plasmodium falciparum] 
gil505175 (Z30657) dihydropteroate synthetase [Plasmodium falciparum] (Match 
DQV) 

>gil505161 (Z30660) dihydropteroate synthetase [Plasmodium falciparum] (Match 
DQV) 

>gil505163 (Z30653) dihydropteroate synthetase [Plasmodium falciparum] (Match 
DQV) 
>gil505165 (Z30664) dihydropteroate synthetase [Plasmodium falciparum] (Match 
DQV) 

>gil6?()466lpirllS47154 dihydropterin pyrophosphokinase/dihydropteroate 
synthetase - Plasmodium falciparum gil505179 (Z31584) Dihydropterin 
pyrophosphokinase and Dihydropteroate synthetase [Plasmodium falciparum] 
(Match DQV) 

>gil505167 (Z30654) dihydropteroate synthetase [Plasmodium falciparum] 
gil505173 (Z30656) dihydropteroate synthetase [Plasmodium falciparum] (Match 
DQV) 

>gil505177 (Z30658) dihydropteroate synthetase [Plasmodium falciparum] (Match 
DQV) 

>gil585279lsplQ08169IHUGA_APIME HYALURONOGLUCOSAMINIDASE 
PRECURSOR (HYALURONIDASE) (ALLERGEN API M II) (API M 2). 
gil476996lpirllA47477 hyaluronidase - honeybee gil155680 (L10710) hyaluronidase 
[Apis mellifera] (Match DQV) 

>gil159276 (M64611) putative [Hydra vulgaris] (Match DVV) 

>gil552162 (L28823) reverse transcriptase [Phlebotomus perniciosus] (Match 
DTV) 

>gil160301 (M15212) glycophorin binding protein [Plasmodium falciparum] 
(Match DEV) 

>gilll8063lsplP16065ICYGS_STRPU SPERACT RECEPTOR PRECURSOR 
(GUANYLATE CYCLASE). gi!279588lpirllOYURCP speract receptor precursor - 
sea urchin (Strongylocentrotus purpuratus) gil 16 1477 (M22444) guanylate cyclase 
[Strongylocentrotus purpuratus] (Match DDV) 

>gil556182 (L36665) ORF; putative [Gonyaulax polyedra] (Match DLV) 
>gill63188 (L06320) alpha-interferon receptor [Bos taurus] (Match DSV) 
>gil246581lbbsl86109 zona pellucida-binding protein. AWN- 1 =C 13" fragment 
[swine, sperm. Peptide Partial. 10 aa] (Match DXV) 

>gil399217lsplP30932ICD9_BOVIN CD9 ANTIGEN. gil89462lpirllJX0221 CD9 
antigen - bovine gil162821 (M81720) CD9 antigen [Bos taurus] (Match DMV) 
>gil562100 (U15975) putative brain ryanodine receptor [Sus scrofa] (Match DQV) 
>gil462415lsplQ04790IINR1_BOVIN INTERFERON-ALPHA/BETA RECEPTOR 
ALPHA CHAIN PRECURSOR (IFN-ALPHA-REC). gil346520lpirllS27387 
interferon alpha receptor type 1 - bovine gil432 (X68443) interferon receptor type 1 
[Bos taurus] (Match DSV) 
>gill37049lsplP01145IUR1_CATCO UROTENSIN I. gil69()6olpirllUOCC 1 M 
urotensin I - white sucker gil268092lgblI02Z~7l Sequence 2 from Patent US 
452818*-) gil270 < 444lgblI0I722l Sequence 2 from Patent US 4908352 (Match DEV) 
>gil2681 13l2blI02366l Sequence 1 from Patent US 4533654 gil2681 14lgblI02367l 
Sequence 2 from Patent US 4533654 (Match DEV) 

>gil268397!gblI03062l Sequence 4 from Patent US 4605642 (Match DEV) 
>2il268996lgblI00642l Sequence 8 from Patent US 4742157 (Match DXV) 
>gil270Q45lgbtI0 17241 Sequence 3 from Patent US 4908352 (Match DEV) 
>gil227 c ) < 4 1 iprill 1 7 1 4327 A urotensin I [Hippoglossoides elassodon] 
gil270946lgblI0 17261 Sequence 4 from Patent US 4908352 (Match DEV) 
>cil5923 18lgblll 1 1631 Sequence 4 from Patent WO 8906658 (Match DSV) 
>gil5931 18lgbll 103471 Sequence 3 from Patent WO 8705938 (Match DTV) 
>gil594746l2blI04467l Sequence 7 from Patent EP 0162738 (Match DGV) 
>2ill32051!splP00875IRBL_SPIOL RIBULOSE BISPHOSPHATE 
CARBOXYLASE LARGE CHAIN PRECURSOR. gil68 1 33lpirliRKSPL ribulose- 
bisphosphate carboxylase (EC 4.1.1.39) large chain precursor - spinach chloroplast 
gil2313l2lpdbl8RUBIL Ribulose 1,5-Bisphosphate Carboxylase(Slash)oxygenase 
(E.C.4.1.1.39) Complex With Co2, Mg++ And 2-Carboxyarabinitol-1,5- 
Bisphosphate gill 2291 (V00168) ribulose 1,5-bisphosphate carboxylase [Spinacia 
oleracea] gil343375 (JO 1443) ribulose bisphosphate carboxylase large subunit 
[Spinacia oleracea] (Match DTV) 

>gill 1 1564lpirllS09074 cytochrome P450-4b - rat (fragment) (Match DGV) 
>gil82261lpirllS06161 chitinase (EC 3.2.1.14) - potato (fragment) giI21465 
(X14133) endochitinase (315 AA) [Solanum tuberosum] (Match DTV) 
>gil84502lpirllB28563 hemoglobin chain IV - earthworm (Lumbricus terrestris) 
(fragment) (Match DDV) 

>gil84636lpirllS0()492 hemocyanin chain la - Japanese spiny lobster (fragment) 
(Match DDV) 

>gil320206lpirllS28389 acyl carrier protein - Escherichia coli (fragment) (Match 
DTV) 

>gi!2813331pirllPQ0397 nonstructural protein NS5 - hepatitis C virus (isolate E- 
bl2) (fragment) (Match DPV) 

>gil538860lpirllA61213 photoreaction center protein H - Rhodospirillum rubrum 
gil227675lprllll709158B puh gene [Rhodospirillum rubrum] (Match DRV) 
>gil979Q4lpirllG35905 hypothetical protein 1 (Sm2) - Streptococcus mutans (Match 
DIV) 

>gil799 Q 5lpirllA2S55 1 hypothetical protein 1 - Streptococcus mutans (strain GS5) 
gill 196925 (M18954) unknown protein [Streptococcus mutans] (Match DIV) 
>gil48301SipirllB47607 immunogenic protein MPB70/MPB80 - Mycobacterium 
bovis (strain BCG) (fragment) (Match DPV) 
>gill 174S53isplP42743IUBCY_ARATH UBIQUITIN-CONJUGATING ENZYME 
E2-18 KD (UBIQUITIN-PROTEIN LIGASE) (UBIQUITIN CARRIER 
PROTEIN) (PM42). gil48181 1 lpirllS394S3 ubiquitin-conjugating enzyme UBC2-1 - 
Arabidopsis thaliana gil22658 (X68306) ubiquitin-conjugating enzyme 
[Arabidopsis thaliana] (Match DKV) 

>gil22549 1 Iprfll 1 30430 1 B glycoprotein S8 [Brassica rapa] (Match DLV) 
>gill37055lspllURl_PLAFE_2 [Segment 2 of 2] UROTENSIN I PRECURSOR. 
gil280657lpirllA43978 urotensin I - European flounder gil227317iprflll701464A 
urotensin I [Platichthys flesus] (Match DEV) 

>gil87715lpirllPH0159 HLA-DRB sigma antigen DRB1-0701-Dw17 - human 
(Match DTV) 

>gil87718lpirllPT0162 HLA-DRB sigma antigen DRB1-0901-Dw23 - human 
(Match DTV) 

>gil91588lpirllPT0641 T-cell receptor beta chain V-D-J region (120-2R) - mouse 
(fragment) (Match DWV) 

>gil481922lpirllS40164 hemagglutinin-neuraminidase - Newcastle disease virus 
gil437889 (X71994) hemagglutinin-neuraminidase [Newcastle disease virus] 
(Match DGV) 

>gilll69937lsplP43519IGLNB_RHOSH NITROGEN REGULATORY PROTEIN 
P-II (PII SIGNAL TRANSDUCING PROTEIN). gil42 1339lpirllS33 180 glnB 
protein - Rhodobacter sphaeroides gil809751 ( X71659) glnB gene product 
[Rhodobacter sphaeroides] g-il 1586928lprf1l2205239A Glu synthetase [Rhodobacter 
sphaeroides] (Match DAV) 

>gil98843lpirllS 14091 40K protein - Saccharopolyspora erythraea (Match DAV) 
>gil479179lpirllS32438 pol polyprotein - Volvox carteri retrotransposon VCRT-I-1 
(fragment) gil938289 (X69621 ) reverse transcriptase [Volvox carteri] (Match 
DDV) 

>gil479181lpirllS32440 pol polyprotein - Volvox carteri retrotransposon VCRT-I-3 
(fragment) gil938291 (X69623) reverse transcriptase [Volvox carteri] (Match 
DDV) 

>gil479183lpirllS32442 pol polyprotein - Volvox carteri retrotransposon VCRT-I-6 
(fragment) gil938294 (X69626) reverse transcriptase [Volvox carteri] (Match 
DDV) 
>gil479184lpirllS32443 pol polyprotein - Volvox carteri retrotransposon VCRT-I-8 
(fragment) (Match DDV) 

>gil479185lpirllS32444 pol polyprotein - Volvox carteri retrotransposon VCRT-II-1 
(fragment) gil938295 (X69629) reverse transcriptase [Volvox carteri] (Match 
DDV) 

>gil479188lpirllS32447 pol polyprotein - Volvox carteri retrotransposon VCRT-II-4 
(fragment) gil938298 (X69632) reverse transcriptase [Volvox carteri] (Match 
DDV) 

>gil479190lpirllS32449 pol polyprotein - Volvox carteri retrotransposon VCRT-II-3 
(fragment) gil938297 (X69631) reverse transcriptase [Volvox carteri] (Match 
DDV) 

>gil477748lpirllB47759 reverse transcriptase (copia-like retrotransposon) - upland 
cotton (fragment) gill 673 17 (M94472) reverse transcriptase [Gossypium hirsutum] 
(Match DDV) 

>gill076316lpirllS51478 Dil9 protein - Arabidopsis thaliana gil4691 10 (X78584) 
Dil9 [Arabidopsis thaliana] (Match DEV) 

>gil99777lpirllS 14951 S-locus-specific glycoprotein SLG-8 - field mustard 
gill 7708 (X55274) S-locus glycoprotein [Brassica campestris] (Match DLV) 
>gil478421lpirllJQ2380 S-locus-specific glycoprotein precursor - rape 
gill076455ipirllS42280 S-locus glycoprotein - rape gill67170 (L08608) S-locus 
glycoprotein [Brassica napus] gil904227 (L10736) S-locus related glycoprotein 
[Brassica napus] (Match DLV) 

>gil99826lpirllS24546 S-locus glycoprotein - rape gill7868 (Zl 1725) S-locus 
glycoprotein [Brassica napus] (Match DLV ) 

>gi!434858 (X76472) pid:g434858 [Crucianella angustifolia] (Match DAV) 
>gil47S565lpirllS10849 alpha-amylase/trypsin inhibitor - durum wheat (Match 
DYV) 

>gill 17275 llsplP41390IPUR1_SCHPO 
AMIDOPHOSPHORIBOSYLTRANSFERASE (GLUTAMINE 
PHOSPHORIBOSYLPYROPHOSPHATE AMIDOTRANSFERASE) (ATASE). 
gil481335ipirlIS384S2 amidophosphoribosyltransferase (EC 2.4.2.14) - fission yeast 
(Schizosaccharomyces pombe) gil629904lpirllS43526 PRPP amidotransferase (EC 
2.4.2.14) - yeast (Schizosaccharomyces pombe) gil410512 (X72293) PRPP 
amidotransferase [Schizosaccharomyces pombe] (Match DFV) 
>gil542640lpirllA48810 fibrinogen B beta subunit - African clawed frog (fragment) 
gil450951 (U05035) fibrinogen B-beta subunit [Xenopus laevis] (Match DDV) 
>gil477549lpirllA49192 transthyretin - bullfrog (fragment) gil299846lbbsl 130235 
transthyretin, T-T3BP=3,5,3'-L-triiodothyronine-specific binding protein {N- 
terminal} [bullfrogs, tadpole plasma, Peptide Partial, 19 aa] (Match DAV) 
>gil481489lpirllS38695 class II histocompatibility antigen beta chain - slender 
loris (fragment) (Match DTV) 
>gil47Sl<SSipiriiE49164 chromogranin-B - rat (fragment) gil239365lbbsio6367 
chromogranin-B, CgB=glucagonoma peptide [rat, Peptide Partial, 38 aa] (Match 
DNV) 

>gil543521lpirllB61222 cytochrome-c oxidase (EC 1.9.3.1) chain II - 
mitochondrion Steinernema intermedii (SGC4) (fragment) (Match DEV) 
>gil5436~2ipirllJQ2350 protein kinase (EC 2.7.1.37) - turkey herpesvirus gil4()678S 
(X68653) protein kinase homologue [Gallid herpesvirus 2] gii58381 1 i A 18267) 
ORF5 [Gallid herpesvirus 2] gii 1 253294!patlUSI5470734l5 Sequence 5 from patent 
US 5470734 (Match DSV) 

>gil4781881pirllF47758 reverse transcriptase (copia-like retrotransposon) - 
Liriodendron chinense (fragment) gii 168306 (M94477) reverse transcriptase 
[Liriodendron chinense] (Match DDV) 

>gilll6359lsplP23472ICHLY_HEVBR HEVAMINE A (CHITINASE / 
LYSOZYME). gil82026lpirllS17205 chitinase (EC 3.2.1.14) hevamine - Para rubber 
tree gil234388lbbsl52808 hevamine [Hevea brasiliensis, Peptide Partial, 273 aa] 
gii 131 1006lpdbllHVQI Glycosidase. Chitin Degradation. Multifunctional Enzyme 
Mol_id: 1: Molecule: Hevamine A: Chain: Null: Ec: 3.2.1.14. 3.2.1.17: Heterogen: 
N-,N'-,N"-Triacetyl-Chitotriose: Other_details: Plant EndochitinaseLYSOZYME 
gill31 1007lpdbllHVMI Glycosidase. Chitin Degradation. Multifunctional Enzyme 
Mol_id: 1: Molecule: Hevamine A: Chain: Null: Ec: 3.2.1.14. 3.2.1.17: 
Other_details: Plant EndochitinaseLYSOZYME gill421554lpdbllLLOI Hevamine 
A (A Plant EndochitinaseLYSOZYME) COMPLEXED WITH Allosamidin 
Chitinase, Lysozyme Mol_id: 1; Molecule: Hevamine: Chain: Null: Synonym: 
ChitinaseLYSOZYME; Ec: Ec 3.2.1.14. 3.2.1.17: Heterogen: Allosamidin (Match 
DSV) 

>gil467S22 (U02606) chitinase [Solanum tuberosum] (Match DTV) 
>gil629"7 17lpirllS43317 chitinase (EC 3.2.1.14) - potato (fragment) gil467824 
(U02607) chitinase [Solanum tuberosum] (Match DTV) 

>gil467911 (U03086) ribulose-1,5-bisphosphate carboxylase/oxygenase large 
subunit [Sarcothalia decipiens] (Match DVV) 

>gil514215 (U02963) dynein beta heavy chain [Chlamydomonas reinhardtii] 
(Match DVV) 
>gil516^2 (U10078) cyclin IbZm [Zea mays] (Match DLV) 

>gilll70247lsplP43082IHEVL_ARATH HEVEIN-LIKE PROTEIN PRECURSOR 
gil40724S (U01880) pre-hevein-like protein [Arabidopsis thaliana] (Match DRV) 
>gil625982lpirllJC2250 S-locus-specific glycoprotein S12 precursor - field mustard 
gil54723Slbbsl 149323 (S70937) S-glycoprotein [Brassica campestris, S12S12 
homozygotes, stigmas, Peptide, 436 aa] gil743639iprfll20132 16A S glycoprotein 
[Brassica rapa] (Match DLV) 

>gil289868 (L12640) ribulose 1,5-bisphosphate carboxylase large subunit 
[Chloranthus japonicus] (Match DTV) 

>gil460648 (L29492) ribulose 1,5 bisphosphate carboxylase [Comesperma 
ericinum] (Match DTV) 

>gil290939 (L12649) ribulose 1,5-bisphosphate carboxylase large subunit 
[Hedyosmum arborescens] (Match DTV) 

>gil310368 (L19972) ribulose 1,5-bisphosphate carboxylase [Stegolepis allenii] 
(Match DKV) 

>gil484236 (L05041) ribulose 1,5-bisphosphate carboxylase large subunit 
[Tradescantia sp.] (Match DKV) 

>2il 166459 (L06946) beta-tubulin [Acremonium uncinatum] (Match DAV) 
>gill66467 (L06954) beta-tubulin [Acremonium sp.] gil 168 1 30 (L06959) beta- 
tubulin [Epichloe amarillans] (Match DAV) 

>gilll9975lsplP16972IFER_ARATH FERREDOXIN PRECURSOR. 
gil99692lpirllS09979 ferredoxin [2Fe-2S] precursor - Arabidopsis thaliana gill 6437 
(X51370) ferredoxin precursor [Arabidopsis thaliana] gill66698 (M35868) 
ferrodoxin A [Arabidopsis thaliana] (Match DIV) 
>gill67172 (M36301) S-6-glycoprotein [Brassica campestris] 
gil225490lprlll 1 30430 1 A glycoprotein S6 [Brassica rapa] (Match DLV) 
>gil 166461 (L06951 ) beta-tubulin [Acremonium coenophialum] gill 66463 
(L06952) beta-tubulin [Acremonium sp.] gil 166469 (L06963) beta-tubulin 
[Acremonium sp.] cil 16647 1 (L06964) beta-tubulin [Acremonium coenophialum] 
sill 68 122 (L06955) beta-tubulin [Epichloe festucae] gil 168 124 (L06956) beta- 
tubulin [Epichloe festucae] gil 168 126 (L06957) beta-tubulin [Epichloe festucae] 
sil 1681 28 (L06958) beta-tubulin [Epichloe amarillans] gill68133 (L06961) beta- 
tubulin [Epichloe amarillans] gil 168 135 (L06962) beta-tubulin [Epichloe sp.] 
(Match DAV) 

>gil169359 (J01262) phaseolin [Phaseolus vulgaris] gil897800 (V01163) phaseolin 
[Phaseolus vulgaris] (Match DDV) 

>gil4574()0 (D21840) MAP kinase [Arabidopsis thaliana] (Match DSV) 
>gil310372 (L13485) ribulosebisphosphate carboxylase [Sphagnum palustre] 
(Match DTV) 

>gil309636 (L11058) 'Ribulosebiphosphate Carboxylase' [Ophioglossum 
engelmannii] (Match DTV) 

>gil3815 (X00788) 1G2 protein [Schizophyllum sp.] (Match DPV) 
>gil547991lsplP36606INAH_SCHPO PROBABLE NA(+)/H(+) ANTIPORTER. 
gilS2S 16ipirilS2095 1 Na+/H+ antiporter - fission yeast (Schizosaccharomyces 
pombe) gil5090 (Z11736) putative sodium/proton antiporter [Schizosaccharomyces 
pombe] (Match DYV) 

>gil 13453 HsplP22553ISLS2_BRAOA S-LOCUS-SPECIFIC GLYCOPROTEIN 
BS29-2 PRECURSOR. gil 1 7SS9 (X16123) S locus specific glycoprotein [Brassica 
oleracea] (Match DLV) 

>gill7894 (X55275) S-locus glycoprotein [Brassica oleracea] (Match DLV) 
>gill34534lsplP07761ISLS6_BRAOL S-LOCUS-SPECIFIC GLYCOPROTEIN S6 
PRECURSOR (SLSG-6). gilS1703lpirllA27827 S-locus-specific glycoprotein S6 
precursor - wild cabbage gill 7901 (Y00268) SLSG (AA -31 to 405) [Brassica 
oleracea] gil225542lprflll305350A protein. S locus allele [Brassica oleracea var. 
botrytis] (Match DLV) 

>gil436130 (X76634) ribulose-1,5-bisphosphate carboxylase [Physcomitrella 
patens] (Match DTV) 

>gill346963lsplP20455IRBL_ATRRS RIBULOSE BISPHOSPHATE 
CARBOXYLASE LARGE CHAIN PRECURSOR. gil99516lpirllF34921 ribulose- 
bisphosphate carboxylase (EC 4.1.1.39) large chain - Atriplex rosea chloroplast 
gil 11323 (X55831) rubisco large subunit [Atriplex rosea] (Match DTV) 
>gill31998lsplP19163IRBL_NEUMU RIBULOSE BISPHOSPHATE 
CARBOXYLASE LARGE CHAIN PRECURSOR. gil68147lpirllRKNULM 
ribulose-bisphosphate carboxylase (EC 4.1.1.39) large chain precursor - Neurachne 
munroi chloroplast gill00640lpirllH34921 ribulose-bisphosphate carboxylase (EC 
4.1.1.39) large chain - Neurachne munroi chloroplast gill 1751 (X55828) rubisco 
large subunit [Neurachne munroi] (Match DKV) 
>gill31999lsplP19164IRBL_NEUTE RIBULOSE BISPHOSPHATE 
CARBOXYLASE LARGE CHAIN PRECURSOR. gil68146lpirllRKNULT 
ribulose-bisphosphate carboxylase (EC 4.1.1.39) large chain precursor - Neurachne 
tenuifolia chloroplast gill00641lpirllG34921 ribulose-bisphosphate carboxylase (EC 
4.1.1.39) large chain - Neurachne tenuifolia chloroplast gill 1798 (X55827) rubisco 
large subunit [Neurachne tenuifolia] (Match DKV) 
>gil299258lbbslI27093 (S561S1) pyruvate dehydrogenase alpha subunit {C- 
terminal} {EC 1.2.4.1} [human, Peptide Partial Mutant, 14 aa] (Match DQV) 
>gil3S5?95lbbsl 133340 (S6207S) platelet-derived growth factor A-chain, PDGF A 
chain {N-terminal} [human, Peptide Partial, 53 aa] (Match DSV) 
>gill248S4lsplPl6S0SIIR10_HCMVA HYPOTHETICAL PROTEIN IRL10 
PRECURSOR (TRL10). gil76487lpirllS09903 hypothetical protein IRL10 precursor 
- human cytomegalovirus (strain AD169) gil8331()8 (X17403) HCMVIRL10 = 
TRL10 (AA 1-171) [Human cytomegalovirus] (Match DNV) 
>gill34532lsplP17840ISLS3_BRAOL S-LOCUS-SPECIFIC GLYCOPROTEIN 
S13 PRECURSOR (SLSG-13). gil81698lpirllB27S27 S-locus-specific glycoprotein 
S13 precursor - wild cabbage (fragment) (Match DLV) 

>gill34533lsplP17841ISLS4_BRAOL S-LOCUS-SPECIFIC GLYCOPROTEIN 
S14 PRECURSOR (SLSG-14). gil81699lpirilC27S27 S-locus-specific glycoprotein 
S14 precursor - wild cabbage (fragment) (Match DIV) 

>iiil267240lsplP30088IUPA W 2_HUMAN UNKNOWN PROTEIN FROM 2D-PAGE 
OF PLASMA (SPOT 10). (Match DQV) 

>gill 17097lsplP00426ICOXA_BOVIN CYTOCHROME C OXIDASE 
POLYPEPTIDE VA. gil66277ipirllCABO cytochrome-c oxidase (EC 1.9.3.1 ) chain 
Va - bovine gil229632lprfll771727A oxidase heme a.cytochrome [Bos taurus] 
(Match DKV) 

>gill26902lsplP80040IMDH_CHLAU MALATE DEHYDROGENASE. (Match 
DIV) 

>gill26903lsplP80039IMDH_CHLTE MALATE DEHYDROGENASE. (Match 
DVV) 

>gill26906lsplP80037IMDH_HELGE MALATE DEHYDROGENASE. (Match 
DIV) 

>gill31906lsplP00879IRBL_ANASP RIBULOSE BISPHOSPHATE 
CARBOXYLASE LARGE CHAIN PRECURSOR. gil68158lpirllRKAIL7 ribulose- 
bisphosphate carboxylase (EC 4.1.1.39) large chain - Anabaena sp. 
gil223640lprfll0904327A carboxylase,RBP [Anabaena sp.] (Match DTV) 
>gil417995lsplP30138ITHIF_ECOLI THIF PROTEIN. (Match DPV) 
>gill36991!splP16787IUL96_HCMVA HYPOTHETICAL PROTEIN UL96. 
gil76602lpirllS09861 hypothetical protein UL96 - human cytomegalovirus (strain 
AD169) gil833080 (X 17403) HCMVUL96 ( AA 1-115) [Human cytomegalovirus] 
(Match DAV) 

>gill37504lsplP21075IVB17_VACCC PROTEIN B17. gil93?()91pirllG42527 B17L 
protein - vaccinia virus (strain Copenhagen) gil335564 (M35027) B17L: putative 
[Vaccinia virus] (Match DNV) 

>gil2672S1lsplQ01221IVB17_VACCV PROTEIN B17. gil321391lpirllJQ1810 
B16L protein - vaccinia virus (strain WR) gil222761 (D11079) 39.5K protein 
[Vaccinia virus] (Match DNV) 
>giil3S2H)lsplPlS538IVGLB_HSVMD GLYCOPROTEIN B PRECURSOR. 
i!ii7? c ?46lpirllVGBERB glycoprotein B precursor - Marek's disease virus (strain 
RB1B) gil221837 (D13713) glycoprotein B precursor [Gallid herpesvirus type 1] 
gil 100890 (U39846) glycoprotein B [Gallid herpesvirus 2] (Match DAV) 
>gil547619lsplP12554IHEMA_NDVA HEMAGGLUTININ-NEURAMINIDASE. 
gil67467ipirllHNNZAV hemagglutinin-neuraminidase (EC 3.2.1.-) - Newcastle 
disease virus (strain Australia- Victoria virulent) (Match DGV) 
>gil547620lsplP35740IHEMA_NDVC HEMAGGLUTININ-NEURAMINIDASE. 
gil419457ipirliC36S29 hemagglutinin-neuraminidase (EC 3.2.1.-) - Newcastle 
disease virus (strain CHI/85) gil332352 (M24716) hemagglutinin-neuraminidase 
[Newcastle disease virus] (Match DRV) 

>gil547621lsplP35741IHEMA_NDVH3 HEMAGGLUTININ-NEURAMINIDASE. 
gil419459lpirllA36829 hemagglutinin-neuraminidase (EC 3.2.1.-) - Newcastle 
disease virus (strain HER/33) (Match DGV) 

>gil122996lsplP12556IHEMA_NDVI HEMAGGLUTININ-NEURAMINIDASE. 
gil77139lpirllS07126 hemagglutinin-neuraminidase (EC 3.2.1.-) - Newcastle disease 
virus (strain Italien) gil332362 (M18640) hemagglutinin-neuraminidase [Newcastle 
disease virus] gil226158lprflll413194A hemagglutinin neuraminidase [Newcastle 
disease virus] (Match DGV) 

>gil547622lsplP35742IHEMA_NDVJ HEMAGGLUTININ-NEURAMINIDASE. 
gil419460lpirllD36829 hemagglutinin-neuraminidase (EC 3.2.1.-) - Newcastle 
disease virus (strain IBA/85) gil332354 (M24717) hemagglutinin-neuraminidase 
[Newcastle disease virus] (Match DRV) 

>gill22997lsplP12557IHEMA_NDVM HEMAGGLUTININ-NEURAMINIDASE. 
gil332368 (M19479) hemagglutinin-neuraminidase glycoprotein [Newcastle 
disease virus] (Match DKV) 

>gill35128lsplP26499ISYI_METTH ISOLEUCYL-TRNA SYNTHETASE 
(ISOLEUCINE-TRNA LIGASE) (ILERS). (Match DKV) 
>gil401222lsplP31779ITTHY_RANCA TRANSTHYRETIN (PREALBUMIN) 
(TADPOLE T3-BINDING PROTEIN) (T-T3BP). (Match DAV) 
>gil465058lsplP33878IVB17_VARV PROTEIN B17. gil419242lpirllI36856 B18L 
protein - variola virus (strain India-1967) gil628217lpirllS46S75 gene B17L protein 
(COP) - variola virus gil439093 (L22579) homolog of vaccinia virus CDS B17L; 
putative [variola major virus] gil457077 (X69198) pid:g457077 [Variola virus] 
gil516436 (X67117) B17L COP gene product [Variola virus] gil885783 (U18339) 
D6L [Variola virus] gil885845 (U18341) B15L [Variola virus] gill 150675 
(X72086) ORF17L: B18L in citation [3] [Variola virus] gil745309lprfll2015436HK 
B18L gene [Variola major virus] (Match DNV) 

>gill38975lsplP15775IVSMP_CVBF PUTATIVE SMALL MEMBRANE 
PROTEIN (NONSTRUCTURAL PROTEIN NS3) (NONSTRUCTURAL 9.7 KD 
PROTEIN). gil74875lpirllMNIHB3 nonstructural protein NS3 - bovine coronavirus 
(strain F15) gil58686 (X51347) NS3 protein (AA 1-84) [Bovine coronavirus] 
(Match DDV) 

>gill38976lsplP15779IVSMP_CVBM PUTATIVE SMALL MEMBRANE 
PROTEIN (NONSTRUCTURAL PROTEIN NS3) (NONSTRUCTURAL 9.7 KD 
PROTEIN). gil418984lpirllD46346 nonstructural protein NS3 - bovine coronavirus 
(strain Mebus) gil323368 (M31054) nonstructural 9.7 kDa protein (put.); putative 
[Bovine coronavirus] (Match DDV) 

>gil465439lsplQ04854IVSMP_CVHOC PUTATIVE SMALL MEMBRANE 
PROTEIN (NONSTRUCTURAL PROTEIN NS3) (NONSTRUCTURAL 9.5 KD 
PROTEIN). gil476391lpirllB44275 nonstructural protein NS3 - human coronavirus 
(strain OC43) gil329569 (M99576) 9.5 kDa nonstructural protein [Human 
coronavirus] (Match DDV) 

>gil549520lsplP36566IYCBD_ECOLI HYPOTHETICAL 29.8 KD PROTEIN IN 
KDSB-KICB INTERGENIC REGION. gil1261828 (D26440) S- 
adenosylmethionine-dependent methltransferase [Escherichia coli] 
gil1585880lprfll2202211A Met(S-adenosyl)-dependent methyltransferase 
[Escherichia coli] (Match DKV) 

>gil465867lsplP34403IYLU9_CAEEL HYPOTHETICAL 14.8 KD PROTEIN 
F10E9.9 IN CHROMOSOME III. (Match DTV) 

>gilll9932lsplP00229IFER1_PHYAM FERREDOXIN I. gil65749lpirllFEFWl 
ferredoxin [2Fe-2S] I - Virginian pokeweed (Match DIV) 

>gilll9959lsplP14938IFER3_RAPSA FERREDOXIN, LEAF L-A. (Match DMV) 
>gill30608lsplP05960IPOL_HV1C4 POL POLYPROTEIN (PROTEASE 
(RETROPEPSIN); REVERSE TRANSCRIPTASE; RIBONUCLEASE H. (Match 
DEV) 

>gill31765lsplP21760IQSP_CHICK QUIESCENCE-SPECIFIC PROTEIN 
PRECURSOR (P20K) (CH21 PROTEIN). gil86417lpirllA30230 quiescence- 
specific protein precursor - chicken (Match DEV) 

>gil208939 (M14181) preproparathyroid hormone [Artificial gene] gil209049 
(M14182) preproparathyroid hormone [Artificial gene] gil209052 (M14183) 
preproparathyroid hormone [Artificial gene] (Match DMV) 
>gil209048 (M14182) synthetic preproparathyroid hormone [Artificial gene] 
gil209051 (M14183) synthetic preproparathyroid hormone [Artificial gene] (Match 
DMV) 

>gil344735 (A04054) MDV gB gene product [unidentified] gil412763 (A06147) gB 
gene product [unidentified] (Match DAV) 

>gil221553 (D10134) NS-5 [Hepatitis C virus] (Match DPV) 

>gil234099lbbsl52140 NS3 protein [bovine enteritic coronavirus BECV, strain F15, 
Peptide, 84 aa] (Match DDV) 

>gil256415lbbsl114657 VP3=major structural polypeptide {N-terminal} [infectious 
flacherie virus IFV, silkworm Bombyx mori, Peptide Partial, 15 aa] (Match DIV) 
>gil454753 (U04469) polymerase [Desert Shield virus] (Match DGV) 
>gill364135lpirllE49600 probable aphid transmission factor - soybean dwarf virus 
gil436022 (L24049) coat protein [Soybean dwarf virus] (Match DLV) 
>gil471720 (U01886) gB homolog [Gallid herpesvirus 2] (Match DAV) 
>gil323678 (M60583) ORF 1; putative [Densovirus of Bombyx type 1] (Match 
DYV) 

>gil305785 (L19242) glycoprotein 120 [Human immunodeficiency virus type 1] 
(Match DPV) 

>gil385141 (L23451) nonstructural protein 5 [Hepatitis C virus type 2b] (Match 
DPV) 

>gi!385143 (L23452) nonstructural protein 5 [Hepatitis C virus type 2b] (Match 
DPV) 

>gil385149 (L23455) nonstructural protein 5 [Hepatitis C virus type 2b] (Match 
DPV) 

>gil332344 (M24712) hemagglutinin-neuraminidase [Newcastle disease virus] 
(Match DGV) 

>gil332346 (M24713 ) hemagglutinin-neuraminidase [Newcastle disease virus] 
(Match DKV) 

>gil332348 (M24714) hemagglutinin-neuraminidase [Newcastle disease virus] 
(Match DGV) 

>gil332350 (M24715) hemagglutinin-neuraminidase [Newcastle disease virus] 
(Match DGV) 

>gil332360 (M22110) hemagglutinin-neuraminidase [Newcastle disease virus] 
(Match DGV) 

>gil457315 (L23S28) RNA polymerase [Norwalk virus] (Match DGV) 
>gil295510 (L07937) 37 kDa protein [Soil-borne wheat mosaic virus] (Match 
DSV) 
>gil4331 1? (U03762) multigene family 360 protein [African swine fever virus] 
(Match DTV) 

>gil76494lpirllS09759 hypothetical protein TRL10 precursor - human 
cytomegalovirus (strain AD169) gil59601 (X17403) HCMVTRL10 = IRL10 (AA 
1-171) [Human cytomegalovirus] (Match DNV) 
>gil21 1503 (M55644) marker protein [Gallus gallus] (Match DEV) 
>gil576796 (M25784) quiescence-specific protein [Gallus gallus] (Match DEV) 
>gil509165 (X70945) cellular retinoic acid binding protein I [Ambystoma 
mexicanum] (Match DDV) 

>gil227060tprflll613430A nmK assocd ORF [Escherichia coli] (Match DQV) 
>gil76336lpirllCOSJS 1G2 protein - bracket fungus (Schizophyllum commune) 
gil224 1 SOIprfll 101 1 193A 1G2 gene ORF [Schizophyllum commune] (Match DPV) 
>gill346547lsplP48040IML1A_SHEEP MELATONIN RECEPTOR TYPE 1A 
(MEL-1A-R). gil602132 (U14109) Mel-1a melatonin receptor [Ovis aries] (Match 
DSV) 

>gil625362lpirllA61338 flavodoxin - Clostridium pasteurianum (fragment) (Match 
DVV) 

>gil6259831pirllJC2251 S-locus-specific glycoprotein S8 precursor - field mustard 
gill30401 1 (D84468) SLG8 [Brassica campestris] (Match DLV) 
>gil628958lpirllS45092 cops protein - Streptococcus pyogenes gill 333835 
(X66468 ) copS gene product [Streptococcus pyogenes] (Match DFV) 
>gil629545lpirllS40470 protein kinase 4. mitogen-activated - Arabidopsis thaliana 
(Match DSV) 

>gill 07646 llpirllSSl 139 S locus glycoprotein - wild cabbage gil624941 (X79431 ) S 
locus glycoprotein [Brassica oleracea] (Match DLV) 

>eil60 1 8 1 2lbbsl 1 5 1 834 (S7201 \ ) P14=small low-abundant nonstructural protein 
[bacteriophage, phi 6. Peptide. 62 aa] (Match DGV) 

>gil632906lbbsl 152232 RNA polymerase [human enteric calicivirus HCV. Peptide 
Partial. 54 aa] (Match DGV) 

>gi|676884 (D29681) The expression is induced by Pi starvation. [Nicotiana
tabacum] gi|1094819|prf||2106387C Al-induced protein [Nicotiana tabacum]
(Match DRV)

>gi|729540|sp|P80348|FUC2_RAT FUCTININ 2 (FUCOSYLTRANSFERASE
INHIBITOR 2). gi|639583|bbs|155067 fuctinin peptide 2=fucosyltransferase
inhibitor {N-terminal} [rats, small intestinal mucosa, Peptide Partial, 22 aa] (Match
DEV)

>gi|755077 (L34630) membrane protein [Synechocystis sp.] gi|1653000 (D90910)
Mn transporter MntB [Synechocystis sp.] (Match DQV)

>gi|765093 (D50053) ORF5 [Orgyia pseudotsugata nuclear polyhedrosis virus]
gi|1584397|prf||2122421B ORF 5 [Orgyia pseudotsugata nuclear polyhedrosis
virus] (Match DKV)
>gi|765256|bbs|156682 (S73813) lymphoid cell activation antigen,
CD39=guanosine diphosphatase homolog [human, B lymphoblastoid cell line, MP-1,
Peptide, 510 aa] (Match DMV)

>gi|1083916|pir||JC2572 hypothetical 18K protein - Leuconostoc oenos phage L10
gi|806612 (L13035) ORFA [Bacteriophage L10] (Match DDV)

>gi|808689 (M19004) unknown protein [Saimirine herpesvirus 1] (Match DWV)

>gi|261755|bbs|122153 aconitase, iron-responsive element binding protein, IRE-BP
{EC 4.2.1.3} [cattle, liver cytosol, Peptide Partial, 11 aa, segment 4 of 6] (Match
DVV)

>gi|544869|bbs|142782 beta-glucosidase [Hordeum vulgare=barley, Sofia, Peptide
Partial, 15 aa, segment 2 of 6] (Match DGV)

>gi|400168|sp||LCAT_PIG_10 [Segment 10 of 11] PHOSPHATIDYLCHOLINE-
STEROL ACYLTRANSFERASE PRECURSOR (LECITHIN-CHOLESTEROL
ACYLTRANSFERASE) (PHOSPHOLIPID-CHOLESTEROL
ACYLTRANSFERASE). (Match DPV)

>gi|400776|sp||PHLD_HUMAN_6 [Segment 6 of 8] PHOSPHATIDYLINOSITOL-
GLYCAN-SPECIFIC PHOSPHOLIPASE D (PI-G PLD) (GLYCOPROTEIN
PHOSPHOLIPASE D). (Match DXV)

>gi|860940 (X78951) core protein [Hepatitis C virus] (Match DGV)
>gi|881414 (U27512) trichocyst matrix protein T4 [Paramecium tetraurelia]
gi|881416 (U27513) trichocyst matrix protein T4 [Paramecium tetraurelia] (Match
DKV)

>gi|881418 (U27514) trichocyst matrix protein T4 [Paramecium tetraurelia] (Match
DKV)

>gi|1361418|pir||S57659 argininosuccinate synthase (EC 6.3.4.5) - Streptomyces
clavuligerus gi|886906 (Z49111) argininosuccinate synthetase [Streptomyces
clavuligerus] gi|1586511|prf||2204224A argininosuccinate synthetase
[Streptomyces clavuligerus] (Match DLV)

>gi|899227 (X03170) SLSG (COOH end); pid:e188274 [Brassica oleracea] (Match
DLV)

>gi|913953|bbs|164394 threonine dehydrogenase, TDH {N-terminal} {EC
1.1.1.103} [Clostridium sticklandii, DSM 519T, ATCC 12662, Peptide Partial, 30
aa] (Match DNV)

>gi|65750|pir||FEFWF ferredoxin [2Fe-2S] I - food pokeweed (Match DIV)
>gi|419454|pir||H46328 hemagglutinin-neuraminidase (EC 3.2.1.-) - Newcastle
disease virus (strain AUS/32) (Match DGV)

>gi|419463|pir||I46328 hemagglutinin-neuraminidase (EC 3.2.1.-) - Newcastle
disease virus (strain MIY/51) (Match DKV)

>gi|419461|pir||B36829 hemagglutinin-neuraminidase (EC 3.2.1.-) - Newcastle
disease virus (strain ITA/45) (Match DGV)

>gi|81707|pir||JX0082 ferredoxin [2Fe-2S] A, leaf - radish (Match DVV)
>gi|81752|pir||S06631 lectin - coral tree (Match DAV)

>gi|541283|pir||B49850 chlorin reductase subunit BchX - Rhodobacter capsulatus
(Match DDV)

>gi|99563|pir||A28965 ribulose-bisphosphate carboxylase (EC 4.1.1.39) large chain
- spinach (fragments) (Match DTV)

>gi|81701|pir||S04906 S-locus-specific glycoprotein S29-2 precursor - wild cabbage
(fragment) (Match DLV)

>gi|89156|pir||A05311 apolipoprotein A-I - pig (fragment) (Match DRV)
>gi|89263|pir||B29544 phosphatidylcholine-sterol O-acyltransferase (EC 2.3.1.43)
peptide B - pig (fragment) (Match DPV)

>gi|911252|pat|US|5411941|10 Sequence 10 from Patent US 5411941
gi|1608026|pat|US|5508263|10 Sequence 10 from patent US 5508263 (Match
DMV)

>gi|911989|pat|US|5422424|1 Sequence 1 from patent US 5422424 (Match DLV)
>gi|106794|pir||S17112 interferon alpha/beta receptor - human (Match DFV)
>gi|1175245|sp|P43996|Y421_HAEIN HYPOTHETICAL PROTEIN HI0421.
gi|1074400|pir||E64007 hypothetical protein HI0421 - Haemophilus influenzae
(strain Rd KW20) gi|1573398 (U32725) H. influenzae predicted coding region
HI0421 [Haemophilus influenzae] (Match DKV)

>gi|1176329|sp|P44812|YIIU_HAEIN HYPOTHETICAL PROTEIN HI0668.
gi|1074476|pir||D64156 hypothetical protein HI0668 - Haemophilus influenzae
(strain Rd KW20) gi|1573669 (U32750) hypothetical [Haemophilus influenzae]
(Match DNV)

>gi|927494 (X89861) 9.6 kDa nonstructural protein [Porcine hemagglutinating
encephalomyelitis coronavirus] (Match DDV)

>gi|927497 (X89863) 9.6 kDa nonstructural protein [Porcine hemagglutinating
encephalomyelitis coronavirus] (Match DDV)

>gi|927500 (X89862) 9.6 kDa nonstructural protein [Porcine hemagglutinating
encephalomyelitis coronavirus] (Match DDV)

>gi|947124|bbs|163644 ferredoxin component al [Raphanus sativus var.
longipinnatus=Chinese radish, leaves, seedlings, Peptide, 96 aa] (Match DVV)
>gi|1363938|pir||S53870 metalloproteinase-3 tissue inhibitor - human
gi|957310|bbs|165606 hTIMP-3=tissue inhibitor of metalloproteinase-3
{N-terminal} [human, Peptide Partial, 18 aa] (Match DIV)
>gi|971666 (F14634) rho protein dissociation inhibitor homolog [Sus scrofa]
(Match DIV)

>gi|995573 (U03772) putative transposase [Acinetobacter calcoaceticus] (Match
DTV)

>gi|995574 (U03772) ORF2 gene product [Acinetobacter calcoaceticus] (Match
DTV)

>gi|998292 (U33482) ependymin [Gasteropelecus sp.] (Match DGV)
>gi|998306 (U33487) ependymin [Nannobrycon sp.] (Match DGV)
>gi|1346543|sp|P49285|ML1A_CHICK MELATONIN RECEPTOR TYPE 1A
(MEL-1A-R). gi|1000104 (U31820) Mel-1a melatonin receptor [Gallus gallus]
(Match DSV)

>gi|1001110 (D64001) hypothetical protein [Synechocystis sp.] (Match DSV)
>gi|1001172 (D64001) hypothetical protein [Synechocystis sp.] (Match DGV)
>gi|1001295 (D64006) hypothetical protein [Synechocystis sp.] (Match DPV)
>gi|1016694 (U33011) urease accessory protein G [Mycobacterium tuberculosis]
gi|1583729|prf||2121356E urease:SUBUNIT=G [Mycobacterium tuberculosis]
(Match DGV)

>gi|1042011|bbs|169021 (S78693) cyclic AMP response element-binding protein-1
alpha isoform=alpha CREB-1 {alternatively spliced, internal fragment} [human,
placenta, Peptide Partial, 21 aa] (Match DSV)

>gi|1050760 (X83665) ribulose-1,5-bisphosphate carboxylase [Rogiera
suffrutenscens] (Match DPV)

>gi|1051157 (X91985) glycoprotein 100 [Marek disease virus type 1] (Match
DAV)

>gi|1052601 (X82442) pid:e122803 [Gallus gallus] (Match DGV)

>gi|1061312 (M87661) nonstructural polyprotein [Norwalk calicivirus] (Match
DGV)

>gi|1351660|sp|Q09907|YAJ7_SCHPO HYPOTHETICAL 40.2 KD PROTEIN
C30D11.07 IN CHROMOSOME I. gi|1065894 (Z67961) unknown
[Schizosaccharomyces pombe] (Match DLV)

>gi|1353146|sp|Q09637|YR11_CAEEL PROBABLE PEPTIDYL-PROLYL CIS-
TRANS ISOMERASE T27D1.1 (PPIASE) (ROTAMASE). (Match DLV)
>gi|1071799|pir||PA0003 nucleoside-diphosphate kinase (EC 2.7.4.6) - Arabidopsis
thaliana (fragment) (Match DGV)

>gi|1083351|pir||PC2239 heat shock protein, high-molecular-mass 105B - mouse
(fragments) (Match DMV)

>gi|1083905|pir||A55209 H transfer determinant A - plasmid R478 gi|1326033
(L20341) IncHI2 transfer repressor [Plasmid R478] (Match DEV)

>gi|1100235 (L48985) resolvase [Pseudomonas syringae] (Match DKV)

>gi|1122533 (U39944) BELL1 [Arabidopsis thaliana] (Match DIV)

>gi|1176915|sp|P42955|YSLB_BACSU HYPOTHETICAL 17.3 KD PROTEIN IN
LYSC 3'REGION. gi|1129090 (J03294) ORF; putative [Bacillus subtilis] (Match
DPV)

>gi|1139612 (U43400) structural phosphoprotein [Human herpesvirus 7] (Match
DVV)

>gi|1146150 (L43365) fiber protein [Human adenovirus type 2] (Match DGV)
>gi|1150923 (X94677) major DNA binding protein [Bovine herpesvirus 1]
gi|1491628|gnl|PID|e258523 (Z78205) UL29 [Bovine herpesvirus 1] (Match DMV)
>gi|1160339 (U21000) MerR [Pseudomonas stutzeri] gi|1586135|prf||2203290A
merR gene [Pseudomonas stutzeri] (Match DAV)

>gi|1163120 (U43537) ORF1; putative ABC excision nuclease repair protein
[Streptomyces argillaceus] (Match DAV)

>gi|1164905 (X83637) ribulose-1,5-bisphosphate carboxylase, large subunit
[Gardenia thunbergia] (Match DKV)

>gi|1171462|bbs|171023 SnaA=pristinamycin IIA synthase 50 kda subunit
{N-terminal, internal fragment} [Streptomyces pristinaespiralis, SP92, ATCC 25486,
Peptide Partial, 20 aa, segment 2 of 2] (Match DFV)

>gi|1173549 (U31208) NADH dehydrogenase type 1 subunit [Anabaena sp.]
(Match DWV)

>gi|1181520 (U42580) A357L [Paramecium bursaria Chlorella virus 1] (Match
DFV)

>gi|1172748|sp|P36672|PTTB_ECOLI PTS SYSTEM, TREHALOSE-SPECIFIC
IIBC COMPONENT (EIIBC-TRE) (TREHALOSE-PERMEASE IIBC
COMPONENT) (PHOSPHOTRANSFERASE ENZYME II, BC COMPONENT)
(EII-TRE). (Match DIV)

>gi|1204170 (Z69729) unknown [Schizosaccharomyces pombe] (Match DNV)
>gi|1213262 (Z69795) unknown [Schizosaccharomyces pombe] (Match DSV)
>gi|1213627 (X95939) type I interferon receptor [Ovis aries] (Match DSV)
>gi|1220217 (U49425) Lucilia cuprina beta esterase-related carboxylesterase
(Lc79) gene, partial cds [Lucilia cuprina] (Match DGV)

>gi|1225955|gnl|PID|e228613 (Z70177) homologous to yqbR of the skin element
[Bacillus subtilis] (Match DKV)
>gi|1235828 (U11972) emml gene product [Streptococcus pyogenes] (Match DEV)
>gi|1235842 (U11998) emml gene product [Streptococcus pyogenes] (Match DTV)
>gi|1236788 (L07418) polyprotein [Southampton virus] (Match DGV)
>gi|1244418 (U26382) VP7 [group A rotavirus] (Match DRV)
>gi|1246922|gnl|PID|el993(jf (A27292) 21B4 [Babesia bovis] (Match DFV)
>gi|1254543|pat|US|5486595|8 Sequence 8 from patent US 5486595 (Match DTV)
>gi|1262126|gnl|PID|e235301 (Z70601) nonstructural protein 1 [Erythrovirus B19]
(Match DLV)

>gi|1293022 (U50250) ribulose-1,5-bisphosphate carboxylase/oxygenase large
subunit [Panax quinquefolius] (Match DVV)

>gi|1304009 (D84469) SLG12 [Brassica campestris] (Match DLV)
>gi|1350495 (L47606) ABA-responsive and embryogenesis-associated gene: lea-
like protein [Picea glauca] (Match DYV)

>gi|1360115|gnl|PID|e213272 (Z68147) glycoprotein B equivalent [Phocine
herpesvirus type 1] (Match DEV)

>gi|1352474|sp|P80507|IPYR_SYNY3 INORGANIC PYROPHOSPHATASE
(PYROPHOSPHATE PHOSPHO-HYDROLASE) (PPASE). (Match DRV)
>gi|1360894|pir||S54285 phosphoglycerate kinase - Thermotoga maritima (Match
DGV)

>gi|1399179 (U49426) 120 kDa immunodominant surface protein [Ehrlichia
chaffeensis] (Match DIV)

>gi|1399491 (U49666) Glp repressor [Pseudomonas aeruginosa] (Match DLV)
>gi|1435070|gnl|PID|e253922 (X99085) integrase [Ascobolus immersus] (Match
DYV)

>gi|1458198 (U63197) helicase [Hepatitis GB virus C] gi|1458200 (U63198)
helicase [Hepatitis GB virus C] gi|1458216 (U63206) helicase [Hepatitis GB virus
C] gi|1458218 (U63207) helicase [Hepatitis GB virus C] gi|1458222 (U63209)
helicase [Hepatitis GB virus C] (Match DSV)

>gi|1458202 (U63199) helicase [Hepatitis GB virus C] (Match DSV)
>gi|1458204 (U63200) helicase [Hepatitis GB virus C] (Match DSV)
>gi|1458206 (U63201) helicase [Hepatitis GB virus C] (Match DSV)
>gi|1458208 (U63202) helicase [Hepatitis GB virus C] gi|1458220 (U63208)
helicase [Hepatitis GB virus C] (Match DSV)

>gi|1458210 (U63203) helicase [Hepatitis GB virus C] (Match DSV)
>gi|1458212 (U63204) helicase [Hepatitis GB virus C] (Match DSV)
>gi|1458214 (U63205) helicase [Hepatitis GB virus C] (Match DSV)
>gi|1458224 (U63210) helicase [Hepatitis GB virus C] (Match DSV)
>gi|1480344|gnl|PID|e254807 (X99405) glucose-6-phosphate dehydrogenase
[Nicotiana tabacum] (Match DLV)

>gi|1311403|pdb|1AUS|L Activated Unliganded Spinach Rubisco Mol_id: 1;
Molecule: Ribulose Bisphosphate Carboxylase/Oxygenase; Chain: L, S;
Synonym: Rubisco; Ec: 4.1.1.39; Heterogen: Carbon Dioxide; Heterogen:
Magnesium (Match DTV)

>gi|1491736|gnl|PID|e223596 (X95287) archaeal ABC-transporter system
[Methanosarcina mazeii] (Match DAV)

>gi|1592296 (U67506) M. jannaschii predicted coding region MJ0568
[Methanococcus jannaschii] (Match DKV)

>gi|1518406|gnl|PID|e220405 (Z69198) ribulose-1,5-bisphosphate carboxylase,
large subunit [Triteleia bridgesii] (Match DLV)
>gi|1518698 (U61753) C3-3 [Oncorhynchus mykiss] (Match DVV)
>gi|1526499 (D87414) MHC class II histocompatibility antigen [Sus scrofa]
(Match DTV)

>gi|1526505 (D87417) MHC class II histocompatibility antigen [Sus scrofa]
(Match DTV)

>gi|1526525 (D87427) MHC class II histocompatibility antigen [Sus scrofa]
(Match DTV)

>gi|1526527 (D87428) MHC class II histocompatibility antigen [Sus scrofa]
(Match DTV)

>gi|1526529 (D87429) MHC class II histocompatibility antigen [Sus scrofa]
(Match DTV)

>gi|1526531 (D87430) MHC class II histocompatibility antigen [Sus scrofa]
(Match DTV)

>gi|1526533 (D87431) MHC class II histocompatibility antigen [Sus scrofa]
(Match DTV)

>gi|1545998 (U60650) polyprotein [Drosophila X virus] (Match DIV)
>gi|1553002 (U65978) interferon alpha/beta receptor-1 [Ovis aries] (Match DSV)
>gi|1567698|gnl|PID|e254689 (A32883) thrombin inhibitor protein [Rhodnius
prolixus] gi|1610446|pat|US|5523287|5 Sequence 5 from patent US 5523287
(Match DPV)
>gi|1567700|gnl|PID|e254629 (A32885) thrombin inhibitor protein [Rhodnius
prolixus] gi|1610447|pat|US|5523287|7 Sequence 7 from patent US 5523287
(Match DPV)

>gi|1567702|gnl|PID|e254631 (A32887) thrombin inhibitor protein [Rhodnius
prolixus] gi|1610448|pat|US|5523287|9 Sequence 9 from patent US 5523287
(Match DPV)

>gi|1567704|gnl|PID|e254632 (A32889) thrombin inhibitor protein [Rhodnius
prolixus] gi|1610449|pat|US|5523287|11 Sequence 11 from patent US 5523287
(Match DPV)

>gi|1567706|gnl|PID|e254633 (A32891) thrombin inhibitor protein [Rhodnius
prolixus] gi|1610450|pat|US|5523287|13 Sequence 13 from patent US 5523287
(Match DPV)

>gi|1567708|gnl|PID|e254634 (A32893) thrombin inhibitor protein [Rhodnius
prolixus] gi|1610451|pat|US|5523287|15 Sequence 15 from patent US 5523287
(Match DPV)

>gi|1567710|gnl|PID|e254691 (A32895) thrombin inhibitor protein [Rhodnius
prolixus] gi|1610452|pat|US|5523287|17 Sequence 17 from patent US 5523287
(Match DPV)

>gi|1575524 (U65005) structural phosphoprotein [Human herpesvirus 7] (Match
DVV)

>gi|1607344|pat|US|5500347|2 Sequence 2 from patent US 5500347 (Match DKV)
>gi|1607345|pat|US|5500347|3 Sequence 3 from patent US 5500347 (Match DKV)
>gi|1607346|pat|US|5500347|4 Sequence 4 from patent US 5500347 (Match DKV)
>gi|1607348|pat|US|5500347|6 Sequence 6 from patent US 5500347 (Match DKV)
>gi|1607349|pat|US|5500347|7 Sequence 7 from patent US 5500347 (Match DKV)
>gi|1608953|pat|US|5510461|9 Sequence 9 from patent US 5510461 (Match DHV)
>gi|1610343|pat|US|5521071|2 Sequence 2 from patent US 5521071 (Match DLV)
>gi|1610926|pat|US|5527773|3 Sequence 3 from patent US 5527773 (Match DKV)
>gi|1610980|pat|US|5527896|56 Sequence 56 from patent US 5527896 (Match
DMV)

>gi|1610981|pat|US|5527896|57 Sequence 57 from patent US 5527896 (Match
DIV)

>gi|1611666|pat|US|5539092|98 Sequence 98 from patent US 5539092 (Match
DKV)

>gi|1613384|pat|US|5559008|67 Sequence 67 from patent US 5559008 (Match
DDV)

>gi|1613387|pat|US|5559008|70 Sequence 70 from patent US 5559008 (Match
DDV)

>gi|1587874|prf||2207325A Antl gene [Aspergillus niger] (Match DEV)
>gi|1648974 (U66481) P1 structural protein [Hepatitis A virus] gi|1648990
(U66489) P1 structural protein [Hepatitis A virus] (Match DPV)

>gi|1648976 (U66482) P1 structural protein [Hepatitis A virus] gi|1648986
(U66487) P1 structural protein [Hepatitis A virus] (Match DPV)

>gi|1648978 (U66483) P1 structural protein [Hepatitis A virus] (Match DPV)
>gi|1648980 (U66484) P1 structural protein [Hepatitis A virus] (Match DPV)
>gi|1648982 (U66485) P1 structural protein [Hepatitis A virus] (Match DPV)
>gi|1648984 (U66486) P1 structural protein [Hepatitis A virus] (Match DPV)
>gi|1648988 (U66488) P1 structural protein [Hepatitis A virus] (Match DPV)
>gi|1651445 (D90730) Hypothetical 29.8 KD protein in kdsB-kicB intergenic
region [Escherichia coli] (Match DKV)

>gi|1651926 (D90901) hypothetical protein [Synechocystis sp.] (Match DLV)
>gi|1651969 (D90901) hypothetical protein [Synechocystis sp.] (Match DDV)
>gi|1652043 (D90902) hypothetical protein [Synechocystis sp.] (Match DLV)
>gi|1653351 (D90913) HlyB family [Synechocystis sp.] (Match DDV)
>gi|1654110 (U14110) melatonin receptor Mel-1a [Phodopus sungorus] (Match
DSV)

>gi|1655822 (U59320) heat shock protein 60 [Leishmania major] (Match DEV)
>gi|1657485 (U73857) similar to E. coli o765 [Escherichia coli] (Match DVV)
>gi|1658269 (U74670) 120 kDa immunodominant surface protein [Ehrlichia
chaffeensis] (Match DIV)